hs2_parquet_constraint and hs2_text_constraint are meant to extend the
test vector dimension to also test a non-default test protocol (other
than beeswax), but limit it to only run against the 'parquet/none' or
'text/none' format, respectively.
This patch renames these constraints to
default_protocol_or_parquet_constraint and
default_protocol_or_text_constraint respectively, such that full file
format coverage happens for the default_test_protocol configuration and
is limited for the other protocols. hs2_parquet_constraint is dropped
entirely from test_utf8_strings.py because that test is already
constrained to the single 'parquet/none' file format.
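The intended shape of such a constraint can be sketched in Python (the
vector layout and names below are illustrative, not the actual
test-framework API):

```python
# Hypothetical sketch of a vector constraint: keep every file format
# under the default protocol, but only 'parquet/none' under any other
# protocol.
DEFAULT_TEST_PROTOCOL = 'beeswax'

def default_protocol_or_parquet_constraint(vector):
    protocol = vector['protocol']
    fmt = (vector['table_format'], vector['compression'])
    # Full coverage for the default protocol; 'parquet/none' otherwise.
    return protocol == DEFAULT_TEST_PROTOCOL or fmt == ('parquet', 'none')

vectors = [
    {'protocol': 'beeswax', 'table_format': 'text', 'compression': 'none'},
    {'protocol': 'hs2', 'table_format': 'text', 'compression': 'none'},
    {'protocol': 'hs2', 'table_format': 'parquet', 'compression': 'none'},
]
kept = [v for v in vectors if default_protocol_or_parquet_constraint(v)]
```

Here the hs2/text vector is filtered out while both beeswax vectors and
the hs2/parquet vector survive.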
The num-modified-rows validation in date-fileformat-support.test and
date-partitioning.test is changed to check the NumModifiedRows counter
from the profile.
Fix TestQueriesJsonTables to always run with the beeswax protocol
because its assertions rely on beeswax-specific return values.
Ran impala-isort and fixed a few flake8 issues in the modified test
files.
Testing:
Ran and passed the affected test files using exhaustive exploration and
the env var DEFAULT_TEST_PROTOCOL=hs2. Confirmed that full file format
coverage happens for the hs2 protocol. Note that
DEFAULT_TEST_PROTOCOL=beeswax is still the default.
Change-Id: I8be0a628842e29a8fcc036180654cd159f6a23c8
Reviewed-on: http://gerrit.cloudera.org:8080/22775
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change adds get_workload() to ImpalaTestSuite and removes it
from all test suites that already returned 'functional-query'.
get_workload() is also removed from CustomClusterTestSuite which
used to return 'tpch'.
All other changes besides impala_test_suite.py and
custom_cluster_test_suite.py are just mass removals of
get_workload() functions.
The behavior is only changed in custom cluster tests that didn't
override get_workload(). By returning 'functional-query' instead
of 'tpch', exploration_strategy() will no longer return 'core' in
'exhaustive' test runs. See IMPALA-3947 on why workload affected
exploration_strategy. An example for affected test is
TestCatalogHMSFailures which was skipped both in core and exhaustive
runs before this change.
get_workload() functions that return a different workload than
'functional-query' are not changed - it is possible that some of
these also don't handle exploration_strategy() as expected, but
individually checking these tests is out of scope in this patch.
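The refactor can be sketched roughly as follows (simplified,
hypothetical stand-ins for the real classes in impala_test_suite.py and
custom_cluster_test_suite.py; the exploration logic is a caricature of
the IMPALA-3947 behavior described above):

```python
class ImpalaTestSuite:
    @classmethod
    def get_workload(cls):
        # New default: subclasses that used to return 'functional-query'
        # no longer need to override this.
        return 'functional-query'

    @classmethod
    def exploration_strategy(cls):
        # Simplified stand-in: per IMPALA-3947, non-default workloads
        # historically downgraded exhaustive runs to 'core'.
        requested = 'exhaustive'
        if cls.get_workload() != 'functional-query':
            return 'core'
        return requested

class CustomClusterTestSuite(ImpalaTestSuite):
    # Previously returned 'tpch' here, which made exploration_strategy()
    # return 'core' even in exhaustive runs, skipping tests like
    # TestCatalogHMSFailures entirely.
    pass
```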
Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115
Reviewed-on: http://gerrit.cloudera.org:8080/22726
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Before this patch, add_mandatory_exec_option() replaced existing query
option values in the 'exec_option' dimension and could cause unintended
test vector duplication. For example, the following declaration creates
two duplicate test vectors, both with "disable_codegen=False":
cls.ImpalaTestMatrix.add_dimension(create_exec_option_dimension(
disable_codegen_options=[False, True]))
add_mandatory_exec_option(cls, "disable_codegen", False)
add_exec_option_dimension() creates a new test dimension for a 'key',
but does not insert it into the 'exec_option' dimension until vector
generation later. It also does not validate whether 'key' already exists
in the 'exec_option' dimension. This can confuse test writers when they
need to write a constraint, because they might look for the value at
vector.get_value('exec_option')['key'] instead of
vector.get_value('key'), and vice versa.
This patch adds assertions to check that no duplicate query option name
is declared through any helper function. It also asserts that all query
option names are declared in lowercase.
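The new validation can be sketched like this (a simplified, hypothetical
version of the helper; the real one operates on the 'exec_option'
dimension of ImpalaTestMatrix):

```python
def add_mandatory_exec_option(declared_options, key, value):
    # Sketch of the new checks: reject non-lowercase names and
    # duplicate declarations instead of silently replacing values.
    assert key == key.lower(), "query option names must be lowercase: %s" % key
    assert key not in declared_options, "duplicate query option: %s" % key
    declared_options[key] = value

opts = {}
add_mandatory_exec_option(opts, "disable_codegen", False)
try:
    add_mandatory_exec_option(opts, "disable_codegen", True)
    duplicate_rejected = False
except AssertionError:
    duplicate_rejected = True
```

The second declaration now fails fast rather than producing duplicate
test vectors.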
Testing:
- Manually verify test vector generation in test files containing the
helper functions by running:
impala-py.test --exploration=exhaustive --collect-only <test_file>
- Adjust query option declaration that breaks after this change.
Change-Id: I8143e47f19090e20707cfb0a05c779f4d289f33c
Reviewed-on: http://gerrit.cloudera.org:8080/21707
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
When performing zero-slot scans on a JSON table for operations like
count(*), we don't require specific data from the JSON; we only need the
number of top-level JSON objects. However, the current JSON parser,
based on rapidjson, still decodes and copies specific data from the
JSON, even in zero-slot scans. Skipping these steps can significantly
improve scan performance.
This patch introduces a JSON skipper to conduct zero-slot scans on JSON
data. Essentially, it is a simplified version of a rapidjson parser,
with the specific data decoding and copying operations removed,
resulting in faster counting of JSON objects. The skipper retains the
ability to recognize malformed JSON and provides the same specific
error codes as the rapidjson parser. Nevertheless, as it bypasses
specific data parsing, it cannot identify string encoding errors or
numeric overflow errors. Despite this, these data errors do not impact
the counting of JSON objects, so it is acceptable to ignore them. The
TEXT scanner exhibits similar behavior.
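The core idea of the skipper can be illustrated with a much-simplified
Python sketch that counts top-level objects by structure alone, without
decoding or copying values (the real skipper is C++ and also validates
structure and reports rapidjson-compatible error codes):

```python
def count_top_level_objects(data):
    """Count top-level JSON objects by tracking brace depth, skipping
    string contents so braces inside values are ignored. No value is
    ever decoded or copied."""
    depth = 0
    in_string = False
    escaped = False
    count = 0
    for ch in data:
        if in_string:
            # Inside a string literal: only watch for the closing quote.
            if escaped:
                escaped = False
            elif ch == '\\':
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == '{':
            if depth == 0:
                count += 1  # a new top-level object begins
            depth += 1
        elif ch == '}':
            depth -= 1
    return count
```

Nested objects and braces embedded in string values do not inflate the
count, which is all a count(*) over the table needs.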
Additionally, a new query option, disable_optimized_json_count_star, has
been added to disable this optimization and revert to the old behavior.
In the performance test of TPC-DS with a format of json/none and a scale
of 10GB, the performance optimization is shown in the following tables:
+-----------+---------------------------+--------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval |
+-----------+---------------------------+--------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCDS(10) | TPCDS-Q_COUNT_UNOPTIMIZED | json / none / none | 6.78 | 6.88 | -1.46% | 4.93% | 3.63% | 9 | -1.51% | -0.74 | -0.72 |
| TPCDS(10) | TPCDS-Q_COUNT_ZERO_SLOT | json / none / none | 2.42 | 6.75 | I -64.20% | 6.44% | 4.58% | 9 | I -177.75% | -3.36 | -37.55 |
| TPCDS(10) | TPCDS-Q_COUNT_OPTIMIZED | json / none / none | 2.42 | 7.03 | I -65.63% | 3.93% | 4.39% | 9 | I -194.13% | -3.36 | -42.82 |
+-----------+---------------------------+--------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
(I) Improvement: TPCDS(10) TPCDS-Q_COUNT_ZERO_SLOT [json / none / none] (6.75s -> 2.42s [-64.20%])
+--------------+------------+---------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| Operator | % of Query | Avg | Base Avg | Delta(Avg) | StdDev(%) | Max | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
+--------------+------------+---------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| 01:AGGREGATE | 2.58% | 54.85ms | 58.88ms | -6.85% | * 14.43% * | 115.82ms | 133.11ms | -12.99% | 3 | 3 | 3 | 1 |
| 00:SCAN HDFS | 97.41% | 2.07s | 6.07s | -65.84% | 5.87% | 2.43s | 6.95s | -65.01% | 3 | 3 | 28.80M | 143.83M |
+--------------+------------+---------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
(I) Improvement: TPCDS(10) TPCDS-Q_COUNT_OPTIMIZED [json / none / none] (7.03s -> 2.42s [-65.63%])
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+--------+-----------+
| Operator | % of Query | Avg | Base Avg | Delta(Avg) | StdDev(%) | Max | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+--------+-----------+
| 00:SCAN HDFS | 99.35% | 2.07s | 6.49s | -68.15% | 4.83% | 2.37s | 7.49s | -68.32% | 3 | 3 | 28.80M | 143.83M |
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+--------+-----------+
Testing:
- Added new test cases in TestQueriesJsonTables to verify that query
results are consistent before and after optimization.
- Passed existing JSON scanning-related tests.
Change-Id: I97ff097661c3c577aeafeeb1518408ce7a8a255e
Reviewed-on: http://gerrit.cloudera.org:8080/21039
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-12019 implemented support for collections of fixed length types
in the sorting tuple. This change implements it for collections of
variable length types.
Note that the limitation that structs that contain any type of
collection are not allowed in the sorting tuple is still in place (see
IMPALA-12160).
Note that sorting by complex types was not and still is not allowed;
this change only allows them to be present in the select list when
sorting by some other expression.
This change also allows collections of variable length types to be
non-passthrough children of UNION ALL nodes.
Testing:
- Renamed the 'simple_arrays_big' table to 'arrays_big' and extended it
with collections containing variable length types. This table is
mainly used to test that spilling works during sorting.
- Renamed
test_sort.py::TestArraySort::{test_simple_arrays,
test_simple_arrays_with_limit}
to {test_array_sort,test_array_sort_with_limit}
- Extended the tests run in test_queries.py::TestQueries::{test_sort,
test_top_n,test_partitioned_top_n} with collections containing
var-len types.
- Added tests in sort-complex.test that assert that it is not allowed
to sort by collections. For structs we already have such tests in
struct-in-select-list.test.
Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852
Reviewed-on: http://gerrit.cloudera.org:8080/20108
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch uses the "external data source" mechanism in Impala to
implement a data source for querying JDBC.
It has some limitations due to the restrictions of "external data
source":
- It is not distributed, e.g., the fragment is unpartitioned. The
queries are executed on the coordinator.
- Queries which read the following data types from external JDBC
tables are not supported:
BINARY, CHAR, DATETIME, and COMPLEX.
- Only binary predicates with the operators =, !=, <=, >=,
<, > are supported for pushdown to the RDBMS.
- The following data types are not supported for predicates:
DECIMAL, TIMESTAMP, DATE, and BINARY.
- External tables with columns of complex types are not supported.
- Support is limited to the following databases:
MySQL, Postgres, Oracle, MSSQL, H2, DB2, and JETHRO_DATA.
- Catalog V2 is not supported (IMPALA-7131).
- DataSource objects are not persistent (IMPALA-12375).
Additional fixes are planned on top of this patch.
Source files under jdbc/conf, jdbc/dao and jdbc/exception are
replicated from Hive JDBC Storage Handler.
In order to query the RDBMS tables, the following steps should be
followed (note that any existing data source table will be rebuilt):
1. Make sure the Impala cluster has been started.
2. Copy the jar files of JDBC drivers and the data source library into
HDFS.
${IMPALA_HOME}/testdata/bin/copy-ext-data-sources.sh
3. Create an `alltypes` table in the Postgres database.
${IMPALA_HOME}/testdata/bin/load-ext-data-sources.sh
4. Create data source tables (alltypes_jdbc_datasource and
alltypes_jdbc_datasource_2).
${IMPALA_HOME}/bin/impala-shell.sh -f\
${IMPALA_HOME}/testdata/bin/create-ext-data-source-table.sql
5. Queries can now be run against the data source tables created in
the last step. There is no need to restart the Impala cluster.
Testing:
- Added a unit test for Postgres and ran it with the JDBC driver
postgresql-42.5.1.jar.
- Ran a manual unit test for MySQL with the JDBC driver
mysql-connector-j-8.1.0.jar.
- Ran core tests successfully.
Change-Id: I8244e978c7717c6f1452f66f1630b6441392e7d2
Reviewed-on: http://gerrit.cloudera.org:8080/17842
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Reviewed-by: Kurt Deschler <kdeschle@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Prototype of HdfsJsonScanner implemented based on rapidjson, which
supports scanning data from split JSON files.
The scanning of JSON data is mainly completed by two parts working
together. The first part is the JsonParser responsible for parsing the
JSON object, which is implemented based on the SAX-style API of
rapidjson. It reads data from the char stream, parses it, and calls the
corresponding callback function when encountering the corresponding JSON
element. See the comments of the JsonParser class for more details.
The other part is the HdfsJsonScanner, which inherits from HdfsScanner
and provides callback functions for the JsonParser. The callback
functions are responsible for providing data buffers to the Parser and
converting and materializing the Parser's parsing results into RowBatch.
It should be noted that the parser returns numeric values as strings to
the scanner. The scanner uses the TextConverter class to convert the
strings to the desired types, similar to how the HdfsTextScanner works.
This is an advantage compared to using the numeric values provided by
rapidjson directly, as it eliminates concerns about inconsistencies in
converting decimals (e.g. losing precision).
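The benefit of string-based conversion can be illustrated with a small
Python sketch (Decimal stands in for Impala's decimal type, and
convert_numeric is a hypothetical stand-in for TextConverter, not the
real API):

```python
from decimal import Decimal

def convert_numeric(raw, target):
    # The parser hands the scanner the raw numeric text; the converter
    # parses it directly into the target type, never round-tripping
    # through a binary double.
    if target == 'decimal':
        return Decimal(raw)
    if target == 'bigint':
        return int(raw)
    return float(raw)

raw = "0.123456789012345678901234567890"
via_string = convert_numeric(raw, 'decimal')
# What decoding to a double first (as rapidjson's number API would)
# gives: the nearest binary double, which drops precision.
via_double = Decimal(float(raw))
```

Converting from the original text preserves every digit; going through
a double does not.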
Added a startup flag, enable_json_scanner, to be able to disable this
feature if we hit critical bugs in production.
Limitations
- Multiline JSON objects are not fully supported yet. It is ok when
each file has only one scan range. However, when a file has multiple
scan ranges, there is a small probability of incomplete scanning of
multiline JSON objects that span ScanRange boundaries (in such cases,
parsing errors may be reported). For more details, please refer to
the comments in the 'multiline_json.test'.
- Compressed JSON files are not supported yet.
- Complex types are not supported yet.
Tests
- Most of the existing end-to-end tests can run on JSON format.
- Added TestQueriesJsonTables in test_queries.py for testing multiline,
malformed, and overflowing JSON.
Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569
Reviewed-on: http://gerrit.cloudera.org:8080/19699
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
As a first stage of IMPALA-10939, this change implements support for
including in the sorting tuple top-level collections that only contain
fixed length types (including fixed length structs). For these types the
implementation is almost the same as the existing handling of strings.
Another limitation is that structs that contain any type of collection
are not yet allowed in the sorting tuple.
Also refactored the RawValue::Write*() functions to have a clearer
interface.
Testing:
- Added a new test table that contains many rows with arrays. This is
queried in a new test added in test_sort.py, to ensure that we handle
spilling correctly.
- Added tests that have arrays and/or maps in the sorting tuple in
test_queries.py::TestQueries::{test_sort,
test_top_n,test_partitioned_top_n}.
Change-Id: Ic7974ef392c1412e8c60231e3420367bd189677a
Reviewed-on: http://gerrit.cloudera.org:8080/19660
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
1. Python 3 requires absolute imports within packages. This
can be emulated via "from __future__ import absolute_import"
2. Python 3 changed division to "true" division that doesn't
round to an integer. This can be emulated via
"from __future__ import division"
This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.
I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
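A minimal example of the two emulated behaviors described above (the
variable names are illustrative):

```python
from __future__ import absolute_import, division, print_function

# Under Python 2, plain '/' on ints truncated; importing 'division'
# makes it true division everywhere, so code that needs an integer
# (an index, a record count) must switch to '//'.
num_records = 7
num_batches = 2
per_batch = num_records / num_batches    # true division on both 2 and 3
last_index = num_records // num_batches  # integer result for indexing
```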
Testing:
- Ran core tests
Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
The erasure coding policy used in our tests is RS-3-2-1024k, which
requires that the block size be no less than 1MB. The test introduced
in IMPALA-11081 uses 'dfs.block.size=1024' to create a table with
multiple blocks, so we should skip this test under erasure coding to
avoid test failures.
Change-Id: I0f088102c380df89f56870d901852f7dde2d72fe
Reviewed-on: http://gerrit.cloudera.org:8080/19515
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes incorrect results caused by short-circuit partition
key scan in the case where a Parquet/ORC file contains multiple
blocks.
IMPALA-8834 introduced an optimization that generates only one scan
range, corresponding to the first block of each file. Backends only
issue footer ranges for Parquet/ORC files for file-metadata-only
queries (see HdfsScanner::IssueFooterRanges()), which leads to
incorrect results if the first block doesn't include the file footer.
This bug is fixed by returning a scan range corresponding to the last
block for Parquet/ORC files to make sure it contains a file footer.
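The fix can be sketched with a hypothetical helper (offsets and lengths
in bytes; the real code operates on HdfsFileDesc block metadata):

```python
def footer_scan_range(file_len, block_size):
    """For a file-metadata-only query, return the scan range for the
    LAST block of the file, which is guaranteed to contain the
    Parquet/ORC footer, instead of the first block."""
    assert file_len > 0 and block_size > 0
    last_block_start = ((file_len - 1) // block_size) * block_size
    return (last_block_start, file_len - last_block_start)

# A 700-byte file with 256-byte blocks spans three blocks; the footer
# lives in the block starting at offset 512, so that range is issued.
```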
Testing:
- Added e2e tests to verify the fix.
Change-Id: I17331ed6c26a747e0509dcbaf427cd52808943b1
Reviewed-on: http://gerrit.cloudera.org:8080/19471
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Restore EC tests that were disabled until HDFS-13539 and HDFS-13540 were
fixed, as the fixes are available in the current version of Hadoop we
test.
Testing: ran these tests with EC enabled.
Change-Id: I8b0bbc604601e6fab742f145c1adfb3c47b3fb6e
Reviewed-on: http://gerrit.cloudera.org:8080/19159
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
IMPALA-9482 added support for the remaining Hive types and removed the
functional.unsupported_types table. A reference remained in a misc
test. test_misc is not marked as exhaustive, but it only runs in
exhaustive builds.
Change-Id: I65b6ea5ac742fbcc427ad41741d347558cb7d110
Reviewed-on: http://gerrit.cloudera.org:8080/18896
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
As part of IMPALA-9979 we made changes to push down predicates
that reference the analytic tuple into the inline view. In cases where
both sides of a predicate are slot references (for example,
a = MAX(b), where MAX(b) is an analytic function), it may not be
safe to push it into the inline view since the two sides may be
referencing separate tuples.
This patch fixes the behavior by skipping such predicates, so that
they are left unassigned and subsequently get assigned to a SELECT
node above the analytic operator.
Testing:
- Added planner tests for analytic predicates ensuring that
analytic predicates are present in the SELECT node.
- Added run time tests for the same using TPC-H and
verified correctness.
Change-Id: Ib5cad3d408ee3695cafb35f66a4f19b4e8d0529e
Reviewed-on: http://gerrit.cloudera.org:8080/17615
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Aman Sinha <amsinha@cloudera.com>
Result spooling has been relatively stable since it was introduced, and
it has several benefits, described in IMPALA-8656. This patch enables
the result spooling (SPOOL_QUERY_RESULTS) query option by default.
Furthermore, some tests need to be adjusted to account for result
spooling being on by default. The following are the adjustment
categories and the tests that fall under each category.
Change in assertions:
PlannerTest#testAcidTableScans
PlannerTest#testBloomFilterAssignment
PlannerTest#testConstantFolding
PlannerTest#testFkPkJoinDetection
PlannerTest#testFkPkJoinDetectionWithHDFSNumRowsEstDisabled
PlannerTest#testKuduSelectivity
PlannerTest#testMaxRowSize
PlannerTest#testMinMaxRuntimeFilters
PlannerTest#testMinMaxRuntimeFiltersWithHDFSNumRowsEstDisabled
PlannerTest#testMtDopValidation
PlannerTest#testParquetFiltering
PlannerTest#testParquetFilteringDisabled
PlannerTest#testPartitionPruning
PlannerTest#testPreaggBytesLimit
PlannerTest#testResourceRequirements
PlannerTest#testRuntimeFilterQueryOptions
PlannerTest#testSortExprMaterialization
PlannerTest#testSpillableBufferSizing
PlannerTest#testTableSample
PlannerTest#testTpch
PlannerTest#testKuduTpch
PlannerTest#testTpchNested
PlannerTest#testUnion
TpcdsPlannerTest
custom_cluster/test_admission_controller.py::TestAdmissionController::test_dedicated_coordinator_planner_estimates
custom_cluster/test_admission_controller.py::TestAdmissionController::test_memory_rejection
custom_cluster/test_admission_controller.py::TestAdmissionController::test_pool_mem_limit_configs
metadata/test_explain.py::TestExplain::test_explain_level2
metadata/test_explain.py::TestExplain::test_explain_level3
metadata/test_stats_extrapolation.py::TestStatsExtrapolation::test_stats_extrapolation
Increase BUFFER_POOL_LIMIT:
query_test/test_queries.py::TestQueries::test_analytic_fns
query_test/test_runtime_filters.py::TestRuntimeRowFilters::test_row_filter_reservation
query_test/test_sort.py::TestQueryFullSort::test_multiple_mem_limits_full_output
query_test/test_spilling.py::TestSpillingBroadcastJoins::test_spilling_broadcast_joins
query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling_aggs
query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling_regression_exhaustive
query_test/test_udfs.py::TestUdfExecution::test_mem_limits
Increase MEM_LIMIT:
query_test/test_mem_usage_scaling.py::TestExchangeMemUsage::test_exchange_mem_usage_scaling
query_test/test_mem_usage_scaling.py::TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling
Increase MAX_ROW_SIZE:
custom_cluster/test_parquet_max_page_header.py::TestParquetMaxPageHeader::test_large_page_header_config
query_test/test_insert.py::TestInsertQueries::test_insert_large_string
query_test/test_query_mem_limit.py::TestQueryMemLimit::test_mem_limit
query_test/test_scanners.py::TestTextSplitDelimiters::test_text_split_across_buffers_delimiter
query_test/test_scanners.py::TestWideRow::test_wide_row
Disable result spooling to maintain assertion:
custom_cluster/test_admission_controller.py::TestAdmissionController::test_set_request_pool
custom_cluster/test_admission_controller.py::TestAdmissionController::test_timeout_reason_host_memory
custom_cluster/test_admission_controller.py::TestAdmissionController::test_timeout_reason_pool_memory
custom_cluster/test_admission_controller.py::TestAdmissionController::test_queue_reasons_memory
custom_cluster/test_admission_controller.py::TestAdmissionController::test_pool_config_change_while_queued
custom_cluster/test_query_retries.py::TestQueryRetries::test_retry_fetched_rows
custom_cluster/test_query_retries.py::TestQueryRetries::test_retry_finished_query
custom_cluster/test_scratch_disk.py::TestScratchDir::test_no_dirs
custom_cluster/test_scratch_disk.py::TestScratchDir::test_non_existing_dirs
custom_cluster/test_scratch_disk.py::TestScratchDir::test_non_writable_dirs
query_test/test_insert.py::TestInsertQueries::test_insert_large_string (the last query only)
query_test/test_kudu.py::TestKuduMemLimits::test_low_mem_limit_low_selectivity_scan
query_test/test_mem_usage_scaling.py::TestScanMemLimit::test_kudu_scan_mem_usage
query_test/test_queries.py::TestQueriesParquetTables::test_very_large_strings
query_test/test_query_mem_limit.py::TestCodegenMemLimit::test_codegen_mem_limit
shell/test_shell_client.py::TestShellClient::test_fetch_size
Testing:
- Pass exhaustive tests.
Change-Id: I9e360c1428676d8f3fab5d95efee18aca085eba4
Reviewed-on: http://gerrit.cloudera.org:8080/16755
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Planner changes:
---------------
The planner now identifies predicates that can be converted into
limits in a partitioned or unpartitioned top-n with the following
method:
* Push down predicates that reference the analytic tuple into the
inline view.
These will be evaluated after the analytic plan for the inline
SelectStmt is generated.
* Identify predicates that reference the analytic tuple and could
be converted to limits.
* If they can be applied to the last sort group of the analytic
plan, and the windows are all compatible, then the lowest
limit gets converted into a limit in the top N.
* Otherwise generate a select node with the conjuncts. We add
logic to merge SELECT nodes to avoid generating duplicates
from inside and outside the inline view.
* The pushed predicate is still added to the SELECT node
because it is necessary for correctness for predicates
like '=' to filter additional rows and also the limit
pushdown optimization looks for analytic predicates
there, so retaining all predicates simplifies that.
The selectivity of the predicate is adjusted so that
cardinality estimates remain accurate.
The optimization can be disabled by setting
ANALYTIC_RANK_PUSHDOWN_THRESHOLD=0. By default it is
only enabled for limits of 1000 or less, because the
in-memory Top-N may perform significantly worse than
a full sort for large heaps (since updating the heap
for every input row ends up being more expensive than
doing a traditional sort). We could probably optimize
this more with better tuning so that it can gracefully
fall back to doing the full sort at runtime.
rank() and row_number() are handled. rank() needs support in
the TopN node to include ties for the last place, which is
also added in this patch.
If predicates are trivially false, we generate empty nodes.
This interacts with the limit pushdown optimization. The limit
pushdown optimization is applied after the partitioned top-n
is generated, and can sometimes result in more optimal plans,
so it is generalized to handle pushing into partitioned top-n
nodes.
Backend changes:
---------------
The top-n node in the backend is augmented to handle
the partitioned case, for which we use a std::map and a
comparator based on the partition exprs. The partitioned
top-n node has a soft limit of 64MB on the size of the
in-memory heaps and can spill with use of an embedded Sorter.
The current implementation tries to evict heaps that are
less effective at filtering rows.
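The per-partition heap maintenance can be sketched in Python (in-memory
only; the std::map, the 64MB soft limit, spilling via the embedded
Sorter, and heap eviction described above are all omitted, and the
function name is illustrative):

```python
import heapq
from collections import defaultdict

def partitioned_top_n(rows, partition_key, order_key, n):
    """Keep the n smallest rows by order_key within each partition,
    using one bounded max-heap per partition value."""
    heaps = defaultdict(list)  # partition value -> heap of size <= n
    for row in rows:
        heap = heaps[partition_key(row)]
        # Negate the order key: heapq is a min-heap, so the root then
        # holds the current worst (largest) of the kept rows.
        item = (-order_key(row), row)
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # New row beats the current worst kept row: replace it.
            heapq.heapreplace(heap, item)
    return {k: sorted(r for _, r in h) for k, h in heaps.items()}
```

Each input row costs at most one O(log n) heap operation, which is why
large limits can lose to a full sort, motivating the 1000-row default
threshold mentioned above.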
Limitations:
-----------
There are several possible extensions to this that we did not do:
* dense_rank() is not supported because it would require additional
backend support - IMPALA-10014.
* ntile() is not supported because it would require additional
backend support - IMPALA-10174.
* Only one predicate per analytic is pushed.
* Redundant rank()/row_number() predicates are not merged,
only the lowest is chosen.
* Lower bounds are not converted into OFFSET.
* The analytic operator cannot be eliminated even if the analytic
expression was only used in the predicate.
* This doesn't push predicates into UNION - IMPALA-10013
* Always false predicates don't result in empty plan - IMPALA-10015
Tests:
-----
* Planner tests - added tests that exercise the interesting code
paths added in planning.
- Predicate ordering in SELECT nodes changed in a couple of cases
because some predicates were pushed into the inline views.
* Modified SORT targeted perf tests to avoid conversion to Top-N
* Added targeted perf test for partitioned top-n.
* End-to-end tests
- Unpartitioned Top-N end-to-end tests
- Basic partitioning and duplicate handling tests on functional
- Similar basic tests on larger inputs from TPC-DS and with
larger partition counts.
- I inspected the results and also ran the same tests with
analytic_rank_pushdown_threshold=0 to confirm that the
results were the same as with the full sort.
- Fallback to spilling sort.
Perf:
-----
Added a targeted benchmark that goes from ~2s to ~1s with
mt_dop=8 on TPC-H 30 on my desktop.
Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Reviewed-on: http://gerrit.cloudera.org:8080/16242
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The planner generates runtime filters for non-join conjuncts
assigned to LEFT OUTER and FULL OUTER JOIN nodes. This is
correct in many cases where NULLs stemming from unmatched rows
would result in the predicate evaluating to false. E.g.
x = y is always false if y is NULL.
However, it is incorrect if the NULL returned from the unmatched
row can result in the predicate evaluating to true. E.g.
x = isnull(y, 1) can return true even if y is NULL.
The fix is to detect cases when the source expression from the
left input of the join returns non-NULL for null inputs and then
skip generating the filter.
Examples of expressions that may be affected by this change are
COALESCE and ISNULL.
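The essence of the check can be sketched in Python, with lambdas
standing in for analyzed expressions (the real planner inspects
expression trees rather than evaluating them; names are illustrative):

```python
def returns_non_null_for_null(expr):
    """Probe the left-input source expression with a NULL slot value.
    If it can yield non-NULL, a runtime filter built from it could
    wrongly drop unmatched outer-join rows, so none is generated."""
    return expr(None) is not None

# 'y' itself is NULL-preserving, so a filter for x = y stays safe.
identity = lambda y: y
# isnull(y, 1) maps NULL to 1, so x = isnull(y, 1) must not produce
# a runtime filter.
isnull_y_1 = lambda y: 1 if y is None else y

safe_to_filter = not returns_non_null_for_null(identity)
unsafe = returns_non_null_for_null(isnull_y_1)
```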
Testing:
Added regression tests:
* Planner tests for LEFT OUTER and FULL OUTER where the runtime
filter was incorrectly generated before this patch.
* Enabled end-to-end test that was previously failing.
* Added a new runtime filter test that will execute on both
Parquet and Kudu (which are subtly different because of nullability of
slots).
Ran exhaustive tests.
Change-Id: I507af1cc8df15bca21e0d8555019997812087261
Reviewed-on: http://gerrit.cloudera.org:8080/16622
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This fix modifies the following tests launched from test_queries.py by
removing references to the database 'functional' whenever possible. The
objective of the change is to allow more testing coverage with
databases other than the single 'functional' database. In this fix,
neither were new tables added nor were expected results altered.
empty.test
inline-view-limit.test
inline-view.test
limit.test
misc.test
sort.test
subquery-single-node.test
subquery.test
top-n.test
union.test
with-clause.test
It was determined that the other tests in
testdata/workloads/functional-query/queries/QueryTest either do not
refer to 'functional' or require the references for some reason.
Testing
Ran query_tests on these changed tests with the exhaustive exploration
strategy.
Change-Id: Idd50eaaaba25e3bedc2b30592a314d2b6b83f972
Reviewed-on: http://gerrit.cloudera.org:8080/16603
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:
Before constant propagation:
WHERE date_col = CAST(timestamp_col as DATE)
AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'
AND date_col = CAST(timestamp_col as DATE)
As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.
Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.
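As a rough sketch of why range propagation through the equality is valid (Python, illustrative names only, not the actual Java planner code): CAST(... AS DATE) is monotonic, so the timestamp range bounds map directly onto date range bounds.

```python
from datetime import date, datetime

def propagate_range_through_cast(ts_lo, ts_hi):
    """Derive a range for date_col from timestamp_col BETWEEN ts_lo AND ts_hi,
    given the equality date_col = CAST(timestamp_col AS DATE). Because the
    cast is monotonic, the bounds map straight through."""
    return ts_lo.date(), ts_hi.date()

# timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' implies
# date_col >= '2019-01-01' AND date_col <= '2020-01-01', enabling pruning.
lo, hi = propagate_range_through_cast(datetime(2019, 1, 1),
                                      datetime(2020, 1, 1))
```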
Testing:
- Added new range constant propagation tests to PlannerTest.
- Added e2e test for range constant propagation based on a newly
added date partitioned table.
- Ran precommit tests.
Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Reviewed-on: http://gerrit.cloudera.org:8080/16346
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
INTERSECT and EXCEPT set operations are implemented as rewrites to
joins. Currently only the DISTINCT qualified operators are implemented,
not ALL qualified. The operator MINUS is supported as an alias for
EXCEPT.
We mimic Oracle and Hive's non-standard implementation which treats all
operators with the same precedence, as opposed to the SQL Standard of
giving INTERSECT higher precedence.
A new class SetOperationStmt was created to encompass the previous
UnionStmt behavior. UnionStmt is preserved as a special case of union
only operands to ensure compatibility with previous union planning
behavior.
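The semantics the join rewrite must preserve can be sketched in a few lines of Python (illustrative only; the real rewrite additionally needs null-safe equality so NULL rows compare equal, which Python's == already provides for None):

```python
def intersect_distinct(left, right):
    """INTERSECT DISTINCT as the rewrite must behave: distinct left rows
    that have a match on the right (a LEFT SEMI JOIN after a DISTINCT)."""
    right_rows = set(right)
    return [row for row in dict.fromkeys(left) if row in right_rows]

def except_distinct(left, right):
    """EXCEPT DISTINCT: distinct left rows with no right match
    (a LEFT ANTI JOIN after a DISTINCT)."""
    right_rows = set(right)
    return [row for row in dict.fromkeys(left) if row not in right_rows]

both = intersect_distinct([1, 2, 2, 3], [2, 3, 4])
only_left = except_distinct([1, 2, 2, 3], [2, 3, 4])
```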
Tests:
* Added parser and analyzer tests.
* Ensured no test failures or plan changes for union tests.
* Added TPC-DS queries 14,38,87 to functional and planner tests.
* Added functional tests test_intersect and test_except.
* Added new planner test testSetOperationStmt.
Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
Reviewed-on: http://gerrit.cloudera.org:8080/16123
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
This commit introduces optional asynchronous code generation.
Asynchronous code generation means that instead of waiting for codegen
to finish, the query starts in interpreted mode while codegen is done on
another thread.
All the function pointers that point to codegen'd functions are changed
to be atomic, wrapped in a CodegenFnPtr. These are initialised to
nullptr and as long as they are nullptr, the corresponding interpreted
functions are used (as before). When code generation is ready, the
function pointers are set by the codegen thread. No synchronisation is
needed as the function pointers are atomic and it is not a problem if,
at a given moment, only a subset of the codegen'd function pointers are
set and the rest are interpreted.
Asynchronous code generation can be turned on using the ASYNC_CODEGEN
boolean query option.
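The dispatch pattern can be sketched as follows (Python stand-in; Impala's real CodegenFnPtr wraps a C++ std::atomic function pointer, here a plain attribute plays that role since CPython attribute assignment is atomic):

```python
class CodegenFnPtr:
    """Slot holding a pointer to a codegen'd function. It starts out None;
    the codegen thread publishes the compiled function when it is ready."""
    def __init__(self):
        self.fn = None

def make_dispatcher(interpreted_fn, ptr):
    """Per-call dispatch: use the codegen'd function once it has been
    published, otherwise fall back to the interpreted implementation."""
    def call(*args):
        fn = ptr.fn
        return fn(*args) if fn is not None else interpreted_fn(*args)
    return call

ptr = CodegenFnPtr()
add = make_dispatcher(lambda a, b: a + b, ptr)
before = add(2, 3)            # interpreted path: pointer still None
ptr.fn = lambda a, b: a + b   # the codegen thread publishes the compiled fn
after = add(2, 3)             # codegen'd path from now on
```

Because each call re-reads the slot, it is harmless for only a subset of pointers to be published at any moment.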
Testing:
- In exhaustive mode, a limited number of end-to-end tests are run in
async mode and with debug actions randomly delaying the codegen
thread and the main thread after starting codegen to test various
scenarios of relative timing. The number of such tests is kept
small to avoid increasing the running time of the tests by too much.
- Added a new end-to-end test, tests/query_test/test_async_codegen.py,
which tests three relative timings:
1. Async codegen finishes before query execution starts (only
codegen'd code runs).
2. Query execution finishes before async codegen finishes (only
interpreted code runs).
3. Async codegen finishes during query execution (both interpreted
and codegen'd code run, switching from interpreted to codegen'd
mode).
Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Reviewed-on: http://gerrit.cloudera.org:8080/15105
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This adds a new version of the pre-existing partition
key scan optimization that always returns correct
results, even when files have zero rows. This new
version is always enabled by default. The old
existing optimization, which does a metadata-only
query, is still enabled behind the
OPTIMIZE_PARTITION_KEY_SCANS query option.
The new version of the optimization must scan the files
to see if they are non-empty. Instead of using metadata
only, the planner instructs the backend to short-circuit HDFS
scans after a single row has been returned from each
file. This gives results equivalent to returning all
the rows from each file, because all rows in the file
belong to the same partition and therefore have identical
values for any columns that are partition key values.
Planner cardinality estimates are adjusted accordingly
to enable potentially better plans and other optimisations
like disabling codegen.
We make some effort to avoid generating extra scan ranges
for remote scans by only generating one range per remote
file.
The backend optimisation is implemented by constructing a
row batch with capacity for a single row only and then
terminating each scan range once a single row has been
produced. Both Parquet and ORC have optimized code paths
for zero-slot table scans, which means this will only result
in a footer read. (Other file formats still need to read
some portion of the file, but can terminate early once
one row has been produced.)
This should be quite efficient in practice with file handle
caching and data caching enabled, because it then only
requires reading the footer from the cache for each file.
The partition key scan optimization is also slightly
generalised to apply to scans of unpartitioned tables
where no slots are materialized.
A limitation of the optimization where it did not apply
to multiple grouping classes was also fixed.
Limitations:
* This still scans every file in the partition. I.e. there is
no short-circuiting if a row has already been found in the
partition by the current scan node.
* Resource reservations and estimates for the scan node do
not all take into account this optimisation, so are
conservative - they assume the whole file is scanned.
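The core of the optimization can be sketched like this (Python, illustrative names only): every file is still scanned, per the first limitation, but each scan stops after one row, which is equivalent to scanning the whole file because all rows in a file carry the same partition-key values.

```python
def scan_one_row(rows_in_file):
    """Short-circuit a scan range: emit at most one row per file. A zero-row
    file correctly contributes nothing, unlike the metadata-only
    OPTIMIZE_PARTITION_KEY_SCANS optimization."""
    return rows_in_file[:1]

def partition_key_scan(files):
    """files: list of (partition_key, rows_in_file) pairs."""
    keys = set()
    for key, rows in files:
        for _ in scan_one_row(rows):
            keys.add(key)
    return keys

keys = partition_key_scan([
    ("2020-01-01", ["r1", "r2"]),
    ("2020-01-02", []),            # empty file: partition must not appear
    ("2020-01-03", ["r3"]),
])
```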
Testing:
* Added end-to-end tests that execute the query on all
HDFS file formats and verify that the correct number of rows
flow through the plan.
* Added planner test based on the existing test partition key
scan test.
* Added test to make sure single node optimisation kicks in
when expected.
* Added test for cardinality estimates with and without stats.
* Added test for unpartitioned tables.
* Added planner test that checks that optimisation is enabled
for multiple aggregation classes.
* Added a targeted perf test.
Change-Id: I26c87525a4f75ffeb654267b89948653b2e1ff8c
Reviewed-on: http://gerrit.cloudera.org:8080/13993
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This enables "modern" catalog features including the
local catalog and HMS notification support in the
dockerised minicluster by default.
The flags can be overridden if needed.
Skip tests affected by these bugs:
* IMPALA-8486 (LibCache invalidations)
* IMPALA-8458 (alter column stats)
* IMPALA-7131 (data sources not supported)
* IMPALA-7538 (HDFS caching DDL not supported)
* IMPALA-8489 TestRecoverPartitions.test_post_invalidate fails with
IllegalStateException
* IMPALA-8459 (cannot drop Kudu table)
* IMPALA-7539 (insert permission checks)
Fix handling of table properties in _get_properties()
to avoid including properties from unrelated sections.
This caused problems because of additional properties
added by metastore event processing.
Rewrite test_partition_ddl_predicates() to change file formats rather
than use HDFS caching DDL.
Update the various test_kudu_col* tests to not expect staleness of
Kudu metadata for catalog V2.
Fix IMPALA-8464 so that testMetaDataGetColumnComments() allows the
table comment to be present, which is the new behaviour. Add a
new end-to-end test test_get_tables() that tests the precise
behaviour for different catalog versions so as to not lose
coverage.
Change-Id: I900d4b718cca98bcf86d36a2e64c0b6a424a5b7c
Reviewed-on: http://gerrit.cloudera.org:8080/13226
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-4865 is fixed so these now pass. I noticed
that the IMPALA-4874 test occasionally hit
"Memory Limit Exceeded" when looped, so I reduced
the data size there slightly.
Testing:
Looped the tests locally against a dockerised minicluster
for a while.
Change-Id: I030f4eff2d3fb771fc92b760efb13170e68285dc
Reviewed-on: http://gerrit.cloudera.org:8080/13233
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This fixes all core e2e tests running on my local dockerised
minicluster build. I do not yet have a CI job or script running
but I wanted to get feedback on these changes sooner. The second
part of the change will include the CI script and any follow-on
fixes required for the exhaustive tests.
The following fixes were required:
* Detect docker_network from TEST_START_CLUSTER_ARGS
* get_webserver_port() does not depend on the caller passing in
the default webserver port. It failed previously because it
relied on start-impala-cluster.py setting -webserver_port
for *all* processes.
* Add SkipIf markers for tests that don't make sense or are
non-trivial to fix for containerised Impala.
* Support loading Impala-lzo plugin from host for tests that depend on
it.
* Fix some tests that had 'localhost' hardcoded - instead it should
be $INTERNAL_LISTEN_HOST, which defaults to localhost.
* Fix bug with sorting impala daemons by backend port, which is
the same for all dockerised impalads.
Testing:
I ran tests locally as follows after having set up a docker network and
starting other services:
./buildall.sh -noclean -notests -ninja
ninja -j $IMPALA_BUILD_THREADS docker_images
export TEST_START_CLUSTER_ARGS="--docker_network=impala-cluster"
export FE_TEST=false
export BE_TEST=false
export JDBC_TEST=false
export CLUSTER_TEST=false
./bin/run-all-tests.sh
Change-Id: Iee86cbd2c4631a014af1e8cef8e1cd523a812755
Reviewed-on: http://gerrit.cloudera.org:8080/12639
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The '.test' files are used to run queries for tests. These files are
run with a vector of default query options. They also sometimes
include SET queries that modify query options. If SET is used on a
query option that is included in the vector, the default value from
the vector will override the value from the SET, leading to tests that
don't actually run with the query options they appear to.
This patch asserts that '.test' files don't use SET for values present
in the default vector. It also fixes various tests that already had
this incorrect behavior.
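The assertion amounts to a simple scan of each '.test' file for SET statements naming options the vector already fixes. A minimal sketch (not the actual test-framework code):

```python
import re

def vector_override_violations(test_file_text, vector_options):
    """Return the query options that a '.test' file SETs even though the
    test vector already fixes them -- those SETs would be silently
    overridden by the vector's default values."""
    hits = []
    for m in re.finditer(r"(?im)^\s*set\s+(\w+)\s*=", test_file_text):
        opt = m.group(1).lower()
        if opt in vector_options:
            hits.append(opt)
    return hits

violations = vector_override_violations(
    "select 1;\nSET batch_size=16;\nset debug_action=FOO;\n",
    {"batch_size", "num_nodes"})
```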
Testing:
- Passed a full exhaustive run.
Change-Id: I4e4c0f31bf4850642b624acdb1f6cb8837957990
Reviewed-on: http://gerrit.cloudera.org:8080/12220
Reviewed-by: Thomas Marshall <thomasmarshall@cmu.edu>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
I used some ideas from Alex Leblang's abandoned patch:
https://gerrit.cloudera.org/#/c/137/ in order to run .test files through
HS2. The advantage of using Impyla is that much of the code will be
reusable for any Python client implementing the standard Python dbapi
and does not require us to implement yet another thrift client.
This gives us better coverage of non-trivial result sets from HS2,
including handling of NULLs, error logs and more interesting result
sets than the basic HS2 tests.
I added HS2 coverage to TestQueries, which has a reasonable variety of
queries and covers the data types in alltypes. I also added
TestDecimalQueries, TestStringQuery and TestCharFormats to get coverage
of DECIMAL, CHAR and VARCHAR that aren't in alltypes. Coverage of
results sets with NULLs was limited so I added a couple of queries.
Places where results differ from Beeswax:
* Impyla is a Python dbapi client, so it must convert timestamps into
Python datetime objects, which only have microsecond precision.
Therefore result timestamps with nanosecond precision are truncated.
* The HS2 interface reports the NULL type as BOOLEAN as a workaround for
IMPALA-914.
* The Beeswax interface reported VARCHAR as STRING, but HS2 reports
VARCHAR.
I dealt with different results by adding additional result sections so
that the expected differences between the clients/protocols were
explicit.
Limitations:
* Not all of the same methods are implemented as for beeswax, so some
tests that have more complicated interactions with the client will not
work with HS2 yet.
* We don't have a way to get the affected row count for inserts.
I also simplified the ImpalaConnection API by removing some unnecessary
methods and moved some generic methods to the base class.
Testing:
* Confirmed that it detected IMPALA-7588 by re-applying the buggy patch.
* Ran exhaustive and CentOS6 tests.
Change-Id: I9908ccc4d3df50365be8043b883cacafca52661e
Reviewed-on: http://gerrit.cloudera.org:8080/11546
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch schedules HDFS EC files without considering locality. Failed
tests are disabled, and a Jenkins build should succeed with export
ERASURE_CODING=true.
Testing: It passes core tests.
Cherry-picks: not for 2.x.
Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Reviewed-on: http://gerrit.cloudera.org:8080/10413
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This has two related changes.
IMPALA-6679: defer scanner reservation increases
------------------------------------------------
When starting each scan range, check to see how big the initial scan
range is (the full thing for row-based formats, the footer for
Parquet) and determine whether more reservation would be useful.
For Parquet, base the ideal reservation on the actual column layout
of each file. This avoids reserving memory that we won't use for
the actual files that we're scanning. This also avoids the need to
estimate ideal reservation in the planner.
We also release scanner thread reservations above the minimum as
soon as threads complete, so that resources can be released slightly
earlier.
IMPALA-6678: estimate Parquet column size for reservation
---------------------------------------------------------
This change also reduces reservation computed by the planner in certain
cases by estimating the on-disk size of column data based on stats. It
also reduces the default per-column reservation to 4MB since it appears
that < 8MB columns are generally common in practice and the method for
estimating column size is biased towards over-estimating. There are two
main cases to consider for the performance implications:
* Memory is available to improve query perf - if we underestimate, we
can increase the reservation so we can do "efficient" 8MB I/Os for
large columns.
* The ideal reservation is not available - query performance is affected
because we can't overlap I/O and compute as much and may do smaller
(probably 4MB I/Os). However, we should avoid pathological behaviour
like tiny I/Os.
When stats are not available, we just default to reserving 4MB per
column, which typically is more memory than required. When stats are
available, the memory required can be reduced below that when the
heuristics tell us with high confidence that the column data for most
or all files is smaller than 4MB.
The stats-based heuristic could reduce scan performance if both the
conservative heuristics significantly underestimate the column size
and memory is constrained such that we can't increase the scan
reservation at runtime (in which case the memory might be used by
a different operator or scanner thread).
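The reservation heuristic can be sketched as follows (Python; the floor constant and the exact formula are assumptions of this sketch, not Impala's actual planner code):

```python
DEFAULT_COLUMN_RESERVATION = 4 * 1024 * 1024   # 4MB default per column
MIN_COLUMN_RESERVATION = 64 * 1024             # illustrative floor

def column_reservation(estimated_col_bytes=None):
    """With no stats, reserve the 4MB default. With stats, reserve less
    only when the estimate already comes in under 4MB -- safe because the
    estimator is biased towards over-estimating column size."""
    if estimated_col_bytes is None:
        return DEFAULT_COLUMN_RESERVATION
    return max(MIN_COLUMN_RESERVATION,
               min(DEFAULT_COLUMN_RESERVATION, estimated_col_bytes))

no_stats = column_reservation()                 # falls back to the default
small = column_reservation(200 * 1024)          # stats say column is small
large = column_reservation(32 * 1024 * 1024)    # capped at the default
```

At runtime the scanner can still request more reservation when the actual file layout warrants the full 8MB I/Os.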
Observability:
Added counters to track when threads were not spawned due to reservation
and to track when reservation increases are requested and denied. These
allow determining if performance may have been affected by memory
availability.
Testing:
Updated test_mem_usage_scaling.py memory requirements and added steps
to regenerate the requirements. Loops test for a while to flush out
flakiness.
Added targeted planner and query tests for reservation calculations and
increases.
Change-Id: Ifc80e05118a9eef72cac8e2308418122e3ee0842
Reviewed-on: http://gerrit.cloudera.org:8080/9757
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
TestHdfsQueries is a subclass of TestQueries and inherits of all its
'test_*' methods, causing these tests to be run twice any time
test_queries.py is run. This was not intentional (it was subclassed
just to inherit 'add_test_dimensions') and causes test runs to take
longer than necessary.
This patch removes the subclass relationship and copies the logic in
add_test_dimensions() from TestQueries into TestHdfsQueries, with a
convenience function added to minimize code duplication.
Testing:
- Ran test_queries.py under both 'core' and 'exhaustive' and checked
that the same tests are run, except all now only a single time each.
Change-Id: Ida659aa7b5131a6a7469baa93a41f7581bd0659a
Reviewed-on: http://gerrit.cloudera.org:8080/10053
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
We should not perform alias substitution in the
subexpressions of GROUP BY, HAVING, and ORDER BY,
to be more conformant with the SQL standard.
=== Allowed ===
SELECT int_col / 2 AS x
FROM functional.alltypes
GROUP BY x;
SELECT int_col / 2 AS x
FROM functional.alltypes
ORDER BY x;
SELECT NOT bool_col AS nb
FROM functional.alltypes
GROUP BY nb
HAVING nb;
=== Not allowed ===
SELECT int_col / 2 AS x
FROM functional.alltypes
GROUP BY x / 2;
SELECT int_col / 2 AS x
FROM functional.alltypes
ORDER BY -x;
SELECT int_col / 2 AS x
FROM functional.alltypes
GROUP BY x
HAVING x > 3;
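The restricted rule amounts to substituting an alias only when the GROUP BY / ORDER BY / HAVING item *is* the alias, never when it appears inside a larger expression. A sketch (Python; expressions modelled as strings/tuples purely for illustration):

```python
def resolve_group_by_item(expr, aliases):
    """Substitute a select-list alias only for a bare, top-level alias
    reference. Inside a subexpression the alias is left alone, so
    analysis subsequently fails to resolve it as a column -- which is
    how 'GROUP BY x / 2' and 'ORDER BY -x' become errors."""
    if isinstance(expr, str) and expr in aliases:
        return aliases[expr]     # "GROUP BY x": expands to int_col / 2
    return expr                  # "GROUP BY x / 2": 'x' not substituted

aliases = {"x": ("div", "int_col", 2)}
top_level = resolve_group_by_item("x", aliases)
nested = resolve_group_by_item(("neg", "x"), aliases)
```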
Some extra checks were added to AnalyzeExprsTest.java.
I had to update other tests to make them pass
since the new behavior is more restrictive.
I added alias.test to the end-to-end tests.
Cherry-picks: not for 2.x.
Change-Id: I0f82483b486acf6953876cfa672b0d034f3709a8
Reviewed-on: http://gerrit.cloudera.org:8080/8801
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
The current implementation of the rand/random built-in functions
uses rand_r from the C library, whose randomness we found to be poor.
pcg32 from a third-party library shows better randomness than rand_r.
Testing:
Revise unit test in expr-test
Add E2E test to random.test
Change-Id: Idafdd5fe7502ff242c76a91a815c565146108684
Reviewed-on: http://gerrit.cloudera.org:8080/8355
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
Currently, constant expressions for the LHS of the IN predicate
are not supported. This patch adds this support as a rewrite in
StmtRewriter (where subqueries are rewritten to joins). Since
there is a nested-loop variant of the left semijoin, IN is supported
simply by not erring out. NOT IN is handled by a rewrite to the
corresponding NOT EXISTS predicate. Support for NOT IN with a
correlated subquery is not included in this change.
Re-organized the frontend subquery analysis tests to expand coverage.
Testing:
- added frontend subquery analysis tests
- added e2e tests
Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb
Reviewed-on: http://gerrit.cloudera.org:8080/8322
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
Testing:
Previously I needed ~20 iterations to get the test to fail on my local
machine. After these changes I haven't been able to reproduce the
failure
Change-Id: I2bea7b0f770dec362a6df075da4e340402bd1d5d
Reviewed-on: http://gerrit.cloudera.org:8080/8562
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
Currently TopN retains old string allocations in a tuple pool which is
held longer than necessary, resulting in unnecessary memory usage.
With this commit, the TopN node will periodically re-materialise the
rows stored in the priority queue and reclaim the old allocations.
This is done when the number of rows removed from the priority queue
is more than twice N (limit + offset). Moreover, a new counter
called "TuplePoolReclamations" is added to the TopN node that keeps
track of the number of times the tuple pool is reclaimed.
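The reclamation policy can be sketched as a heap that compacts its backing storage once more than 2*N rows have been evicted (Python; plain lists stand in for the tuple pool, and "re-materialise" is modelled as rebuilding the heap's backing list):

```python
import heapq

class TopN:
    """Keep the N smallest keys, reclaiming once evictions exceed 2*N."""
    def __init__(self, n):
        self.n = n
        self.heap = []       # max-heap via negated keys: largest is evictable
        self.evicted = 0
        self.reclaims = 0    # mirrors the TuplePoolReclamations counter

    def insert(self, key, row):
        if len(self.heap) < self.n:
            heapq.heappush(self.heap, (-key, row))
        elif -self.heap[0][0] > key:
            heapq.heapreplace(self.heap, (-key, row))
            self.evicted += 1
            if self.evicted > 2 * self.n:
                self._reclaim()

    def _reclaim(self):
        # Copy surviving rows into a fresh pool; old allocations free up.
        self.heap = list(self.heap)
        heapq.heapify(self.heap)
        self.evicted = 0
        self.reclaims += 1

topn = TopN(2)
for k in range(9, -1, -1):   # descending input: worst case, many evictions
    topn.insert(k, "row%d" % k)
```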
Testing:
Test added to test_queries.py which sets a low mem_limit such
that the test would fail if reclamation is not implemented and pass
otherwise.
Performance:
Query 1 (expected general case):
select * from tpch.lineitem order by l_orderkey desc limit 10;
Query 2 (example worst case: data stored in reverse order before
feeding to the last TopN node):
select * from (select * from tpch.lineitem order by l_orderkey desc
limit 6001215) tb order by l_orderkey limit 10;
                    With Reclaim           Without Reclaim
                  Query 1    Query 2     Query 1     Query 2
MaxTuplePoolMem   3.96 KB    3.43 KB     110.2 MB    708.8 MB
Time (mean)       2s 218ms   6s 391ms    2s 021ms    6s 406ms
Time (stdev)      74.38ms    67.45ms     102.71ms    70.44ms
Reclaims          910        5861        N/A         N/A
We notice that memory footprint is orders of magnitude lower while
maintaining similar query runtimes. Cluster perf testing will be done
later.
Change-Id: I968f57f0ff2905bd581908bc5c5ee486b31e6aa8
Reviewed-on: http://gerrit.cloudera.org:8080/7400
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
This change executes the tests added to subplans.test and removes
a test which incorrectly references subplannull_data.test (a file
which does not exist).
Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91
Reviewed-on: http://gerrit.cloudera.org:8080/7038
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
Adds Impala support for TIMESTAMP types stored in Kudu.
Impala stores TIMESTAMP values in 96-bits and has nanosecond
precision. Kudu's timestamp is a 64-bit microsecond delta
from the Unix epoch (called UNIXTIME_MICROS), so a conversion
is necessary.
When writing to Kudu, TIMESTAMP values in nanoseconds are
rounded to the nearest microsecond.
When reading from Kudu, the KuduScanner returns
UNIXTIME_MICROS with 8 bytes of padding so Impala can convert
the value to a TimestampValue in-line and copy the entire
row.
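The write-path conversion can be sketched as follows (Python; round-half-up on the sub-microsecond remainder is an assumption of this sketch -- the precision loss itself is the point):

```python
def ns_to_unixtime_micros(unix_nanos):
    """Convert a nanosecond-precision Unix timestamp to Kudu's 64-bit
    UNIXTIME_MICROS by rounding to the nearest microsecond."""
    return (unix_nanos + 500) // 1000

lo = ns_to_unixtime_micros(1_000_000_499)   # sub-microsecond part rounds down
hi = ns_to_unixtime_micros(1_000_000_500)   # sub-microsecond part rounds up
```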
Testing:
Updated the functional_kudu schema to use TIMESTAMPs instead
of converting to STRING, so this provides some decent
coverage. Some BE tests were added, and some EE tests as
well.
TODO: Support pushing down TIMESTAMP predicates
TODO: Support TIMESTAMPs in range partitioning expressions
Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Reviewed-on: http://gerrit.cloudera.org:8080/6526
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
The union node acts as a pass-through operator and forwards row batches
from its children without materializing them. This is done when
a child's tuple layout is identical to the union node's tuple layout
and no functions need to be applied to the child's row batches.
Removed operand reordering in the FE because it's simpler and safer to
handle all passthrough children before non-passthrough children in the
BE. The recent improvements to memory management allowed us to remove
this requirement.
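The pass-through decision can be sketched per child (Python, illustrative shapes only): forward a batch unchanged when the layouts match and no exprs apply, otherwise materialise row by row.

```python
def union_batches(children):
    """children: list of (batches, needs_materialization, exprs) triples.
    Pass-through children hand their batches through without copying."""
    for batches, needs_mat, exprs in children:
        for batch in batches:
            if not needs_mat:
                yield batch                        # zero-copy pass-through
            else:
                yield [exprs(row) for row in batch]

batch = [("a", 1), ("b", 2)]
out = list(union_batches([
    ([batch], False, None),                        # pass-through child
    ([[("c",)]], True, lambda r: (r[0].upper(),))  # needs materialization
]))
```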
Testing:
- Added new planner and end to end tests that cover the new
functionality.
- Updated existing tests to reflect the new behavior.
Perf:
Ran a benchmark on a local 10 GB tpcds dataset. I used an unpartitioned
version of the store_sales table. There was over a 2x performance
improvement for the following query:
SELECT
COUNT(ss_sold_time_sk),
COUNT(ss_item_sk),
COUNT(ss_customer_sk),
COUNT(ss_cdemo_sk),
COUNT(ss_hdemo_sk),
COUNT(ss_addr_sk),
COUNT(ss_store_sk),
COUNT(ss_promo_sk),
COUNT(ss_ticket_number),
COUNT(ss_quantity),
COUNT(ss_wholesale_cost),
COUNT(ss_list_price),
COUNT(ss_sales_price),
COUNT(ss_ext_discount_amt),
COUNT(ss_ext_sales_price),
COUNT(ss_ext_wholesale_cost),
COUNT(ss_ext_list_price),
COUNT(ss_ext_tax),
COUNT(ss_coupon_amt),
COUNT(ss_net_paid),
COUNT(ss_net_paid_inc_tax),
COUNT(ss_net_profit),
COUNT(ss_sold_date_sk)
FROM (
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
union all
select * from tpcds_10_parquet.store_sales_unpartitioned
) t
Before:
Total Time: 43s164ms
Summary:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
------------------------------------------------------------------------------------------------------------------------------
13:AGGREGATE 1 224.721us 224.721us 1 1 28.00 KB -1.00 B FINALIZE
12:EXCHANGE 1 24.578us 24.578us 3 1 0 -1.00 B UNPARTITIONED
11:AGGREGATE 3 2s402ms 3s060ms 3 1 119.00 KB 10.00 MB
00:UNION 3 35s380ms 37s846ms 288.01M 288.01M 3.08 MB 0
|--02:SCAN HDFS 3 184.197ms 219.931ms 28.80M 28.80M 535.03 MB 1.88 GB store_sales_unpartitioned
|--03:SCAN HDFS 3 131.956ms 153.401ms 28.80M 28.80M 534.98 MB 1.88 GB store_sales_unpartitioned
|--04:SCAN HDFS 3 178.456ms 247.721ms 28.80M 28.80M 534.98 MB 1.88 GB store_sales_unpartitioned
|--05:SCAN HDFS 3 189.398ms 242.251ms 28.80M 28.80M 535.01 MB 1.88 GB store_sales_unpartitioned
|--06:SCAN HDFS 3 122.786ms 156.528ms 28.80M 28.80M 534.98 MB 1.88 GB store_sales_unpartitioned
|--07:SCAN HDFS 3 147.467ms 183.391ms 28.80M 28.80M 535.13 MB 1.88 GB store_sales_unpartitioned
|--08:SCAN HDFS 3 147.502ms 186.273ms 28.80M 28.80M 535.01 MB 1.88 GB store_sales_unpartitioned
|--09:SCAN HDFS 3 130.086ms 154.682ms 28.80M 28.80M 535.04 MB 1.88 GB store_sales_unpartitioned
|--10:SCAN HDFS 3 122.701ms 161.056ms 28.80M 28.80M 534.89 MB 1.88 GB store_sales_unpartitioned
01:SCAN HDFS 3 287.863ms 330.436ms 28.80M 28.80M 534.98 MB 1.88 GB store_sales_unpartitioned
After:
Total Time: 19s139ms
Summary:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
------------------------------------------------------------------------------------------------------------------------------
13:AGGREGATE 1 166.241us 166.241us 1 1 28.00 KB -1.00 B FINALIZE
12:EXCHANGE 1 71.695us 71.695us 3 1 0 -1.00 B UNPARTITIONED
11:AGGREGATE 3 2s971ms 3s809ms 3 1 3.08 MB 10.00 MB
00:UNION 3 207.956ms 222.846ms 288.01M 288.01M 0 0
|--02:SCAN HDFS 3 1s533ms 1s535ms 28.80M 28.80M 532.28 MB 1.88 GB store_sales_unpartitioned
|--03:SCAN HDFS 3 1s554ms 1s669ms 28.80M 28.80M 525.73 MB 1.88 GB store_sales_unpartitioned
|--04:SCAN HDFS 3 1s568ms 1s716ms 28.80M 28.80M 525.03 MB 1.88 GB store_sales_unpartitioned
|--05:SCAN HDFS 3 1s503ms 1s617ms 28.80M 28.80M 527.43 MB 1.88 GB store_sales_unpartitioned
|--06:SCAN HDFS 3 1s560ms 1s634ms 28.80M 28.80M 528.52 MB 1.88 GB store_sales_unpartitioned
|--07:SCAN HDFS 3 1s489ms 1s643ms 28.80M 28.80M 534.81 MB 1.88 GB store_sales_unpartitioned
|--08:SCAN HDFS 3 1s534ms 1s581ms 28.80M 28.80M 528.10 MB 1.88 GB store_sales_unpartitioned
|--09:SCAN HDFS 3 1s558ms 1s674ms 28.80M 28.80M 526.77 MB 1.88 GB store_sales_unpartitioned
|--10:SCAN HDFS 3 1s504ms 1s692ms 28.80M 28.80M 527.83 MB 1.88 GB store_sales_unpartitioned
01:SCAN HDFS 3 1s682ms 1s911ms 28.80M 28.80M 526.14 MB 1.88 GB store_sales_unpartitioned
Change-Id: Ia8f6d5062724ba5b78174c3227a7a796d10d8416
Reviewed-on: http://gerrit.cloudera.org:8080/5816
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
This patch addresses warning messages from pytest re: the imported
TestMatrix, TestVector, and TestDimension classes, which were being
collected as potential test classes. The fix was to simply prefix
the class names with 'Impala':
git grep -l 'TestDimension' | xargs \
sed -i 's/TestDimension/ImpalaTestDimension/g'
git grep -l 'TestMatrix' | xargs \
sed -i 's/TestMatrix/ImpalaTestMatrix/g'
git grep -l 'TestVector' | xargs \
sed -i 's/TestVector/ImpalaTestVector/g'
The tests all passed in an exhaustive run on the upstream jenkins
server:
http://jenkins.impala.io:8080/view/Utility/job/pre-review-test/8/
Change-Id: I06b7bc6fd99fbb637a47ba376bf9830705c1fce1
Reviewed-on: http://gerrit.cloudera.org:8080/5794
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
Adds initial support for the functional-query test workload
for Kudu tables.
There are a few issues that make loading the functional
schema difficult on Kudu:
1) Kudu tables must have one or more columns that together
constitute a unique primary key.
a) Primary key columns must currently be the first columns
in the table definition (KUDU-1271).
b) Primary key columns cannot be nullable (KUDU-1570).
2) Kudu tables must be specified with distribution
parameters.
(1) limits the tables that can be loaded without ugly
workarounds. This patch only includes important tables that
are used for relevant tests, most notably the alltypes*
family. In particular, alltypesagg is important but it does
not have a set of columns that are non-nullable and form a unique
primary key. As a result, that table is created in Kudu with
a different name and an additional BIGINT column for a PK
that is a unique index and is generated at data loading time
using the ROW_NUMBER analytic function. A view is then
wrapped around the underlying table that matches the
alltypesagg schema exactly. When KUDU-1570 is resolved, this
can be simplified.
(2) requires some additional considerations and custom
syntax. As a result, the DDL to create the tables is
explicitly specified in CREATE_KUDU sections in the
functional_schema_constraints.csv, and an additional
DEPENDENT_LOAD_KUDU section was added to specify custom data
loading DML that differs from the existing DEPENDENT_LOAD.
TODO: IMPALA-4005: generate_schema_statements.py needs refactoring
Tests that are not relevant or not yet supported have been
marked with xfail and a skip where appropriate.
TODO: Support remaining functional tables/tests when possible.
Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Reviewed-on: http://gerrit.cloudera.org:8080/4175
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
on the website.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.
Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
Adds a query option 'strict_mode' which treats integer and
floating point overflows as parse errors. In the past,
overflows were ignored and the max value was returned. When
this query option is set, overflowing values are treated as if
they were completely invalid data, i.e. NULL is returned.
When abort_on_error is enabled, this means the query is
aborted.
Notes:
* DECIMAL overflow/underflow is already treated as an error.
* The handling in text-converter treats underflows the same
as overflows, so they would result in the same behavior.
However, floating point parsing never returns an underflow
today.
* We may also want to handle numeric values that are truncated
when parsing to integer types, e.g. 10.5 -> 10.
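The described semantics can be sketched as follows. This is an illustration only, not Impala's actual parser; `parse_int32` and its return conventions are stand-ins for the behavior described above:

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31

def parse_int32(text, strict_mode=False):
    """Legacy behavior saturates an overflowing value at the type
    bounds; with strict_mode it is treated like invalid data and
    becomes NULL (None here). With abort_on_error also set, the
    query would abort instead."""
    value = int(text)
    if INT32_MIN <= value <= INT32_MAX:
        return value
    if strict_mode:
        return None                      # parse error -> NULL
    return INT32_MAX if value > INT32_MAX else INT32_MIN

print(parse_int32("99999999999"))                    # 2147483647
print(parse_int32("99999999999", strict_mode=True))  # None
```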
Change-Id: I7409c31ec0cb6fe0b2d9842b9f58fe1670914836
Reviewed-on: http://gerrit.cloudera.org:8080/3150
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
Previously Impala disallowed LOAD DATA and INSERT on S3. This patch
functionally enables LOAD DATA and INSERT on S3 without making major
changes for the sake of improving performance over S3. This patch also
enables both INSERT and LOAD DATA between file systems.
S3 does not support the rename operation, so the staged files in S3
are copied instead of renamed, which contributes to the slow
performance on S3.
The FinalizeSuccessfulInsert() function now does not make any
underlying assumptions of the filesystem it is on and works across
all supported filesystems. This is done by adding a full URI field to
the base directory for a partition in the TInsertPartitionStatus.
Also, the HdfsOp class now does not assume a single filesystem and
gets connections to the filesystems based on the URI of the file it
is operating on.
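The per-file dispatch can be pictured with a small sketch; `connection_for` is a hypothetical helper, not the real HdfsOp API, shown only to illustrate routing by URI scheme:

```python
from urllib.parse import urlparse

def connection_for(uri):
    """Pick a filesystem connection from the URI scheme instead of
    assuming a single default filesystem (hypothetical helper)."""
    scheme = urlparse(uri).scheme
    return scheme if scheme else "hdfs"  # scheme-less paths fall back

print(connection_for("s3a://bucket/warehouse/part=1/f.txt"))  # s3a
print(connection_for("/warehouse/part=1/f.txt"))              # hdfs
```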
Added a python S3 client called 'boto3' to access S3 from the python
tests. A new class called S3Client is introduced which wraps the
boto3 functions with the same function signatures as PyWebHdfsClient
by deriving from an abstract base class BaseFileSystem, so that the
two clients can be used interchangeably through a
'generic_client'. test_load.py is refactored to use this generic
client. The ImpalaTestSuite setup creates a client according to the
TARGET_FILESYSTEM environment variable and assigns it to the
'generic_client'.
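A minimal sketch of that shape, assuming method names along the lines of PyWebHdfsClient's; the in-memory subclass is a stand-in so the interface can be exercised without boto3 or a live filesystem:

```python
from abc import ABC, abstractmethod

class BaseFileSystem(ABC):
    """Abstract interface shared by the HDFS and S3 clients
    (illustrative method names)."""

    @abstractmethod
    def create_file(self, path, file_data, overwrite=True):
        pass

    @abstractmethod
    def delete_file_dir(self, path, recursive=False):
        pass

class InMemoryFileSystem(BaseFileSystem):
    """Stand-in implementation; the real S3Client would issue boto3
    calls behind these same signatures."""

    def __init__(self):
        self.files = {}

    def create_file(self, path, file_data, overwrite=True):
        if path in self.files and not overwrite:
            return False
        self.files[path] = file_data
        return True

    def delete_file_dir(self, path, recursive=False):
        self.files.pop(path, None)
        return True

# The test setup would bind 'generic_client' to S3Client or
# PyWebHdfsClient based on TARGET_FILESYSTEM; any BaseFileSystem works.
generic_client = InMemoryFileSystem()
generic_client.create_file("test-warehouse/tbl/f0.txt", b"row data")
print("test-warehouse/tbl/f0.txt" in generic_client.files)  # True
```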
P.S.: Currently, test_load.py runs 4x slower on S3 than on
HDFS. Performance needs to be improved in future patches. INSERT
performance is slower than on HDFS too. This is mainly because of an
extra copy that happens between staging and the final location of a
file. However, larger INSERTs come closer to HDFS performance than
smaller INSERTs.
ACLs are not taken care of for S3 in this patch. It is something
that still needs to be discussed before implementing.
Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d
Reviewed-on: http://gerrit.cloudera.org:8080/2574
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
The sorter converts string pointers to block offsets when spilling.
There was a subtle bug in the logic that assumed if the offset was
past the end of the current block, the data must necessarily be in
the next block. This is not true for zero-length strings, because
there is no backing storage so the pointer can point to the byte
after the end of the block.
This patch fixes the bug by using a simpler offset encoding scheme
that packs the block number into the upper 32 bits and the offset
within the block into the lower 32 bits.
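The encoding scheme can be sketched as follows (function names are illustrative, not the actual sorter methods):

```python
def encode_offset(block_idx, offset_in_block):
    """Pack the block number into the upper 32 bits and the offset
    within the block into the lower 32 bits."""
    return (block_idx << 32) | offset_in_block

def decode_offset(encoded):
    return encoded >> 32, encoded & 0xFFFFFFFF

# A zero-length string whose pointer sits one byte past the end of an
# 8MB block is still unambiguously tied to that block by its index.
print(decode_offset(encode_offset(3, 8 * 1024 * 1024)))  # (3, 8388608)
```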
It also slightly refactors the functions so that the method signatures
and types are more consistent with the rest of the Impala codebase.
Also fix a bug with handling of multiple query options in tests.
Change-Id: I5f64593e94d367d6b6efb61a8b86e35516f18839
Reviewed-on: http://gerrit.cloudera.org:8080/2780
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
We do not have exceptions enabled for codegen'd code, so exceptions
thrown by functions called by codegen'd functions cannot be caught by
the codegen'd functions. TimestampValue::UnixTimeToPtime() has a
try/catch around boost::posix_time::ptime_from_tm(), but since it was
inlined into the TimestampFunctions::FromUnix() IR the try/catch
didn't work. This patch moves the UnixTimeToPtime() implementation to
the .cc file so it doesn't get included in the IR. It does the same
for TimestampParser::Parse() in case it gets inlined into IR code as
well.
Change-Id: Ic0af73629e1e3b6bf18cbf5d832973712b068527
Reviewed-on: http://gerrit.cloudera.org:8080/2210
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
Only run the test for parquet (since it doesn't really depend on the
file format) and only execute it serially, since it's very memory hungry.
Change-Id: I0b0ade840ff510fff0fc83532f12b835b3eca76a
Reviewed-on: http://gerrit.cloudera.org:8080/2206
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This is the first step to fix issues with large memory allocations. In
this patch, the built-in `group_concat` is no longer allowed to allocate
arbitrarily large strings and crash Impala, but is limited to the upper
bound of possible allocations in Impala.
This patch does not perform any functional change, but rather avoids
unnecessary crashes. However, it changes the parameter type of
FindChunk() in MemPool to be a signed 64-bit integer. This change allows
the MemPool to internally allocate memory of more than 1GB, but the
public interface of Allocate() is not changed, so the general limitation
remains. The reason for this change is as follows:
1) In a UDF FunctionContext::Reallocate() would allocate slightly more
than 512MB from the FreePool.
2) The free pool tries to double this size to allocate 1GB from the
MemPool.
3) The MemPool doubles the size again and overflows the signed 32-bit
integer in the FindChunk() method. This will then only allocate 1GB
instead of the expected 2GB.
The result is that one of the callers expects a larger allocation
than actually happened, which in turn leads to memory corruption as
soon as the memory is accessed.
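The overflow chain can be reproduced with plain arithmetic; this is a sketch of the sizes described above, not the actual MemPool code:

```python
def as_signed_int32(n):
    """Interpret n as a two's-complement signed 32-bit value."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n

requested = 513 * 1024 * 1024       # UDF Reallocate(): slightly over 512MB
free_pool_ask = 2 * requested       # FreePool doubles: ~1GB
mem_pool_ask = 2 * free_pool_ask    # MemPool doubles again: ~2GB

# Passed through the old signed 32-bit FindChunk() parameter, the
# ~2GB request wraps to a negative value instead of the intended size.
print(mem_pool_ask, as_signed_int32(mem_pool_ask))
```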
Change-Id: I068835dfa0ac8f7538253d9fa5cfc3fb9d352f6a
Reviewed-on: http://gerrit.cloudera.org:8080/858
Tested-by: Internal Jenkins
Reviewed-by: Dan Hecht <dhecht@cloudera.com>