IMPALA-7307 disabled column index writing for floating point columns
until PARQUET-1222 is resolved. However, only NaN values are
problematic, so we can write the column index when the data contains
no NaNs.
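For illustration, a small Python sketch of why NaN breaks min/max based
filtering, and of the check this patch effectively performs (the helper
name is hypothetical; the real writer is C++):
  import math

  nan = float("nan")
  # Every comparison against NaN is false, so page min/max values that
  # involve NaN give no usable ordering for filtering (PARQUET-1222).
  print(nan < 1.0, nan > 1.0, nan == nan)   # False False False

  def can_write_column_index(values):
      # Hypothetical helper mirroring the patch: suppress the column
      # index if any NaN was seen while writing the column's pages.
      return not any(isinstance(v, float) and math.isnan(v) for v in values)

  print(can_write_column_index([1.0, 2.5]))           # True  -> write index
  print(can_write_column_index([1.0, float("nan")]))  # False -> skip index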
Testing:
* Added tests that should fail if a column index is
written for a column that contains NaN values.
Change-Id: Ic9d367500243c8ca142a16ebfeef6c841f013434
Reviewed-on: http://gerrit.cloudera.org:8080/14264
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The main refactoring is to move result expressions into the
DataSink implementations, which is where they are used
in the backend. This will make it easier to explicitly collect
all the expressions in the plan tree for the purposes of
projection. Previously the expressions were owned by
the PlanFragment and passed into the DataSink.
Show result exprs in the explain plan of table sinks
at higher verbosity.
Change-Id: I163a393b5ce6b8a926b3fee9b4b920e31d6846b2
Reviewed-on: http://gerrit.cloudera.org:8080/14270
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Added a feature that estimates the number of rows in an HDFS table
when cardinality statistics for that table are unavailable.
Also added a query option to revert the change in case of regression.
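A minimal sketch of the estimate, assuming it divides the table's
on-disk size by an estimated row width derived from the column types
(hypothetical names, not the planner's Java code):
  def estimate_scan_cardinality(total_file_bytes, est_row_size_bytes):
      # Without stats, approximate rows as bytes-on-disk / bytes-per-row.
      if est_row_size_bytes <= 0:
          return -1                      # still unknown
      return max(1, total_file_bytes // est_row_size_bytes)

  # A 1 MB table with ~512-byte rows is estimated at ~2048 rows.
  print(estimate_scan_cardinality(1 << 20, 512))   # 2048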
Testing:
(1) In CardinalityTest.java, replaced the original statement
"verifyCardinality("SELECT a FROM functional.tinytable", -1);" in
the method testBasicsWithoutStats() with
"verifyCardinality("SELECT a FROM functional.tinytable", 2);".
(2) In CardinalityTest.java, added more tests to check the cardinality
of most PlanNode implementations. For each tested PlanNode, the behavior
is tested both with the feature enabled and disabled.
(3) In set.test, modified three related test cases to make sure that
the added query option is included after executing "set all" in various
scenarios.
(4) There are 8 JUnit tests in PlannerTest.java that would produce different
distributed query plans when this feature is enabled. For 6 of those 8
affected JUnit tests, added an additional JUnit test that runs with the
feature enabled. Specifically, each tested query in the newly added test
files involves at least one HDFS table without available statistics.
We did not add test cases for the other 2 affected JUnit tests,
testResourceRequirements() and testSpillableBufferSizing(), since doing
so results in flaky tests. In this patch we only test them with the
feature disabled.
(5) There are 5 Python end-to-end tests containing queries that would
produce different results. For each affected query, added an additional
query that runs with this feature disabled.
Change-Id: Ic414121c8df0d5222e4aeea096b5365beb04568a
Reviewed-on: http://gerrit.cloudera.org:8080/12974
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit implements page filtering based on the Parquet page index.
Reading and evaluating the page index is done by the
HdfsParquetScanner. First, we determine the row ranges we are
interested in; based on those row ranges we determine the candidate
pages for each column that we are reading.
We still issue one ScanRange per column chunk, but we specify
sub-ranges that store the candidate pages, i.e. we don't read
the whole column chunk, but only fractions of it.
Pages are not aligned across column chunks, i.e. page #2 of column A
might store completely different rows than page #2 of column B.
This means we need row-skipping logic when reading the data
pages. This logic is implemented in
BaseScalarColumnReader and ScalarColumnReader. Collection column
readers know nothing about page filtering.
Page filtering can be turned off by setting the query option
'read_parquet_page_index' to false.
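A minimal sketch of the candidate-page selection (Python with
illustrative names; the real logic lives in the C++ HdfsParquetScanner):
  def candidate_pages(first_row_indices, num_rows_in_chunk, row_ranges):
      # first_row_indices[i] is the first row stored in page i (from the
      # Parquet offset index); row_ranges holds (start_row, end_row)
      # pairs we want to read. A page is a candidate if its row span
      # overlaps any wanted range.
      pages = []
      for i, first in enumerate(first_row_indices):
          last = (first_row_indices[i + 1] - 1
                  if i + 1 < len(first_row_indices)
                  else num_rows_in_chunk - 1)
          if any(lo <= last and first <= hi for lo, hi in row_ranges):
              pages.append(i)
      return pages

  # Four pages of 100 rows each; only rows 150..180 are wanted -> page 1.
  print(candidate_pages([0, 100, 200, 300], 400, [(150, 180)]))   # [1]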
Testing:
* added some unit tests for the row range and
page selection logic
* generated various Parquet files with Parquet-MR
* enabled page index writing and ran selective queries against
tables written by Impala. Existing tests are likely to exercise
page filtering transparently.
Performance:
* Measured locally, observed 3x to 20x speedup for selective queries.
The speedup was proportional to the I/O operations that needed to be
done.
* The TPC-H benchmark didn't show a significant performance change. This
is not a surprise since the data is not sorted in any useful way, so
the main goal was to not introduce a perf regression.
TODO:
* measure performance for remote reads
Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Reviewed-on: http://gerrit.cloudera.org:8080/12065
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Storage layer information was added to the query profile by
IMPALA-6050. This broke some tests on exhaustive and s3 runs
due to changes in formatting.
This fixes the issues:
1. Replace HDFS SCAN with $FILESYSTEM_NAME SCAN in some test files
2. Add $FILESYSTEM_NAME to partition information string
Testing:
- Ran exhaustive HDFS tests
- Ran s3 tests
Change-Id: I11c6ab9c888464a0f0daaf8a7a6f565d25731872
Reviewed-on: http://gerrit.cloudera.org:8080/13025
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
A recent fix added node cardinality to the standard EXPLAIN output,
displaying a large number like 123456780 as 123.46M. This patch applies
the same fix to the remaining row count numbers: metadata, extrapolated
rows, etc.
Tests:
* Rebased PlannerTest .test files as needed for the new row count
format.
* Reran all tests to check for dependencies on the old format.
Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
Reviewed-on: http://gerrit.cloudera.org:8080/12438
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Impala often generates SQL for statements using the toSql() call.
Generated SQL is often used during testing or when writing the query
plan. Impala keywords such as "create", when used as identifiers,
must be quoted:
SELECT `select`, `from` FROM `order` ...
The code in ToSqlUtils.getIdentSql() quotes the identifier if it is
an Impala or Hive keyword, or if it does not follow the identifier
pattern. The code uses the Hive lexer to detect keywords, but it
contained a flaw: the lexer expects case-normalized (upper-cased) input,
while we provided the identifier in its original case. As a result,
"MONTH" was caught as a Hive keyword and quoted, but "month" was not.
This patch fixes that flaw.
This patch also fixes:
IMPALA-8051: Compute stats fails on a column with comment character in
name
The code uses the Hive lexical analyzer to check names. Since "#" and
"--" start comments, a name like "foo#" is parsed as just "foo", which
does not need quotes; hence "foo#" was left unquoted, causing issues.
Added a special check for "#" and "--" to resolve this issue.
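A sketch of the combined fix (a hypothetical Python helper, not the
Java implementation):
  COMMENT_CHARS = ("#", "--")

  def needs_quotes(ident, is_keyword):
      # Comment characters would be silently swallowed by the lexer, so
      # check for them explicitly; then normalize case before the keyword
      # check so "month" and "MONTH" are treated identically.
      if any(c in ident for c in COMMENT_CHARS):
          return True
      return is_keyword(ident.upper())

  keywords = {"MONTH", "SELECT", "FROM"}
  print(needs_quotes("month", lambda s: s in keywords))   # True
  print(needs_quotes("foo#", lambda s: s in keywords))    # True
  print(needs_quotes("foo", lambda s: s in keywords))     # False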
Testing:
* Refactored getIdentSql() for easier testing.
* Added tests to the recently added ToSqlUtilsTest for this case and
several others.
* Making this change caused the columns `month`, `year`, and `key` to be
quoted when before they were not. Updated many tests as a result.
* Added a new identSql() function, for use in tests, that matches
Impala's quoting and handles wildcards and multi-part names. Used
this in ToSqlTest to handle the quoted names.
* PlannerTest emits statement SQL to the output file wrapped to 80
columns and sometimes leaves trailing spaces at the end of the line.
Some tools remove that trailing space, resulting in trivial file
differences. Fixed this to remove trailing spaces in order to simplify
file comparisons.
* Tweaked the "In pipelines" output to avoid trailing spaces when no
pipelines are listed.
* Reran all FE tests.
Change-Id: I06cc20b052a3a66535a171c36b4b31477c0ba6d0
Reviewed-on: http://gerrit.cloudera.org:8080/12009
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.
Testing:
- added new tests to check both logical and converted types
to test_insert_parquet.py
Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Reviewed-on: http://gerrit.cloudera.org:8080/12004
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Cardinality is vital to understanding why a plan has the form it does,
yet the planner normally emits cardinality information only for the
detailed levels. Unfortunately, most query profiles we see are at the
standard level without this information (except in the summary table),
making it hard to understand what happened.
This patch adds cardinality to the standard EXPLAIN output. It also
changes the displayed cardinality value to be in abbreviated "metric"
form: 1.23K instead of 1234, etc.
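For illustration, a sketch of such a "metric" formatter (the suffix
choices and rounding shown are illustrative):
  def abbreviate(n):
      # 1234 -> 1.23K, 123456780 -> 123.46M; small values unchanged.
      for divisor, suffix in ((10**9, "B"), (10**6, "M"), (10**3, "K")):
          if n >= divisor:
              return "%.2f%s" % (n / float(divisor), suffix)
      return str(n)

  print(abbreviate(1234))        # 1.23K
  print(abbreviate(123456780))   # 123.46M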
Changing the EXPLAIN output has a huge impact on PlannerTest: all the
"golden" test files must change. To avoid doing this twice, this patch
also includes:
IMPALA-7919: Add predicates line in plan output for partition key
predicates
This is also a good time to include:
IMPALA-8022: Add cardinality checks to PlannerTest
The comparison code was changed to allow a set of validators, one of
which compares cardinality to ensure it is within 5% of the expected
value. This should ensure we don't change estimates unintentionally.
While many planner tests are concerned with cardinality, many others
are not. Testing showed that cardinality is actually unstable in some
tests. For such tests, added filters to ignore cardinality. The filter
is enabled by default (for backward compatibility) but disabled (to
allow cardinality verification) for the critical tests.
Rebasing the tests was complicated by a bug in the error-matching code,
so this patch also fixes:
IMPALA-8023: Fix PlannerTest to handle error lines consistently
Now, the error output written to the output "save results" file matches
that expected in the "golden" file -- no more handling these specially.
Testing:
* Added cardinality verification.
* Reran all FE tests.
* Rebased all PlannerTest .test files.
* Adjusted the metadata/test_explain.py test to handle the changed
EXPLAIN output.
Change-Id: Ie9aa2d715b04cbb279aaffec8c5692686562d986
Reviewed-on: http://gerrit.cloudera.org:8080/12136
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
If explain_level is at 'extended' level or higher, then enhance the
output from the explain command. (1) Show the analyzed SQL in the
explain header; this is the rewritten SQL, which includes implicit
casts, and literals are printed with a cast so that their type is
visible. (2) When predicates are shown in the plan, they are shown in
the same format.
The toSql() method can be called on a ParseNode tree to return
the SQL corresponding to the tree. In the past toSql() has been
enhanced to print rewritten SQL by partially overloading toSql() [with
toSql(boolean)]. The current change would require changing toSql() in
many places, as NumericLiteral can appear at different points in a
parse tree. To avoid many new fragile overloads of toSql() I added
toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the
form of the SQL that is returned. This changes many files but is safer
and means that any future options to toSql() can be added painlessly.
If SHOW_IMPLICIT_CASTS is passed to toSql() then
- in CastExpr print the implicit cast
- in NumericLiteral print the literal with a cast to show the type
Add a PlannerTestOption directive that will force the query text showing
implicit casts to be included in the PLAN section of a .test file.
The analyzed query text is wrapped at 80 characters. Note that the
analyzed query cannot always be executed, as queries rewritten to use
LEFT SEMI JOIN are not legal SQL. In addition, some space characters may
be removed from the query for prettier display.
Documentation of this change will be done as IMPALA-7718
EXAMPLE OUTPUT:
[localhost:21000] default> set explain_level=2;
EXPLAIN_LEVEL set to 2
[localhost:21000] default> explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100;
Query: explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100
Max Per-Host Resource Reservation: Memory=0B Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col
AS DOUBLE) < CAST(10 AS DOUBLE)
""
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B thread-reservation=2
PLAN-ROOT SINK
| mem-estimate=0B mem-reservation=0B thread-reservation=0
|
00:SCAN KUDU [functional_kudu.alltypestiny]
predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE)
mem-estimate=4.88MB mem-reservation=0B thread-reservation=1
tuple-ids=0 row-size=97B cardinality=1
in pipelines: 00(GETNEXT)
Fetched 16 row(s) in 0.03s
TESTING:
All end-to-end tests pass.
Added a new test in ExprRewriterTest which prints SQL with implicit
casts for some interesting queries.
Add a unit test for the code which wraps text at 60 characters.
The output of some planner tests in .test files has been updated to
include the analyzed SQL that is printed when explain_level is at
least the 'extended' level.
Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
Reviewed-on: http://gerrit.cloudera.org:8080/11719
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Thomas Marshall <thomasmarshall@cmu.edu>
The .test file parser implemented an unconventional method for parsing
single-quoted strings in comma-separated value format, which didn't
handle trailing commas in strings correctly.
This commit switches to using a conventional method for parsing
comma-separated value format:
* Commas enclosed by single quotes are not treated as field separators
* Single quotes can be escaped within a string by doubling them.
I looked into using Python's csv module for this, but it wouldn't
work without further changes to the test file format, because it
automatically discards the quotes during parsing, and the quotes are
actually semantically important in .test files. E.g. without the quotes
we can't distinguish between the literal string 'regex:...' and the
regex regex:....
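A minimal sketch of the conventional rules described above (Python, but
not the actual test-framework parser):
  def split_csv_line(line):
      # Commas are separators only outside single quotes; a doubled ''
      # inside quotes is an escaped quote. Quotes are kept in the output
      # because they are semantically meaningful in .test files.
      fields, buf, in_quotes, i = [], "", False, 0
      while i < len(line):
          c = line[i]
          if c == "," and not in_quotes:
              fields.append(buf)
              buf = ""
          elif c == "'" and in_quotes and line[i + 1:i + 2] == "'":
              buf += "''"          # escaped quote stays inside the field
              i += 1
          else:
              if c == "'":
                  in_quotes = not in_quotes
              buf += c
          i += 1
      fields.append(buf)
      return fields

  print(split_csv_line("'a,b',c"))    # ["'a,b'", 'c']
  print(split_csv_line("'it''s',"))   # ["'it''s'", ''] - trailing comma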
Testing:
Ran exhaustive tests and fixed .test files that required modifications.
Will rerun before merging.
Added a couple of tests to exercise edge cases in the test file parser.
Change-Id: I18ddcb0440490ddf8184be66d3681038a1615dd9
Reviewed-on: http://gerrit.cloudera.org:8080/11800
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Tim Armstrong <tarmstrong@cloudera.com>
This commit adds the command line flag enable_parquet_page_index_writing
to the Impala daemon, which toggles Impala's ability to write the
Parquet page index. By default the flag is false, i.e. Impala doesn't
write the page index.
This flag is only temporary; we plan to remove it once Impala is able to
read the page index and has better testing around it.
Because of this change I had to move test_parquet_page_index.py to the
custom_cluster test suite since I need to set this command line flag
in order to test the functionality. I also merged most of the test cases
because we don't want to restart the cluster too many times.
I removed 'num_data_pages_' from BaseColumnWriter since it was rather
confusing and didn't provide any measurable performance improvement.
This commit fixes the ASAN error produced by the first IMPALA-7644
commit which was reverted later.
Change-Id: Ib4a9098a2085a385351477c715ae245d83bf1c72
Reviewed-on: http://gerrit.cloudera.org:8080/11694
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit adds the command line flag enable_parquet_page_index_writing
to the Impala daemon, which toggles Impala's ability to write the
Parquet page index. By default the flag is false, i.e. Impala doesn't
write the page index.
This flag is only temporary; we plan to remove it once Impala is able to
read the page index and has better testing around it.
Because of this change I had to move test_parquet_page_index.py to the
custom_cluster test suite since I need to set this command line flag
in order to test the functionality. I also merged most of the test cases
because we don't want to restart the cluster too many times.
Change-Id: If9994882aa59cbaf3ae464100caa8211598287bc
Reviewed-on: http://gerrit.cloudera.org:8080/11563
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This adds some informational output to explain plans and
sends the information to the backend.
The idea is that this will make it easier to explain how Impala's
pipelined execution works and also enable future work on profile
analysis that can more intelligently group plan nodes.
Tests:
* Updated planner tests to include new output.
Change-Id: I1d10eb14d997242f445e5c5fc5362d5410370721
Reviewed-on: http://gerrit.cloudera.org:8080/10848
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Adds the resource estimates for key benchmark workloads:
TPC-H, TPC-DS, TPC-H Nested and TPC-H Kudu to the planner
test so that we can track changes in resource requirements
and estimates for these queries.
Also, don't show decimal places for MB and KB estimates. The
estimates are not accurate to that level, and displaying
extra precision has some disadvantages:
* It suggests to readers that the estimates have a high level of
precision.
* It increases the odds of small variations in file sizes, etc.,
causing test failures.
Also fixed a regex in the stress test that didn't escape the decimal
point correctly.
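The regex problem, for illustration (the pattern shown is illustrative,
not the exact stress-test regex):
  import re

  # An unescaped "." matches any character, so the first pattern wrongly
  # accepts "12x34MB"; escaping the dot requires a literal decimal point.
  bad = re.compile(r"\d+.\d+MB")
  good = re.compile(r"\d+\.\d+MB")
  print(bool(bad.match("12x34MB")), bool(good.match("12x34MB")))  # True False
  print(bool(good.match("12.34MB")))                              # True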
Testing:
Ran core tests.
Change-Id: I6a9f836699200ea87fb03bf36abad0e23949ac26
Reviewed-on: http://gerrit.cloudera.org:8080/11087
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit builds on the previous work of
Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/
The commit implements the write path of PARQUET-922:
"Add column indexes to parquet.thrift". As specified in the
parquet-format, Impala writes the page indexes just before
the footer. This allows much more efficient page filtering
than using the same information from the 'statistics' field
of DataPageHeader.
I updated Pooja's python tests as well.
Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Reviewed-on: http://gerrit.cloudera.org:8080/9693
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This squashes the following patches, which were previously reverted.
I will fix the known issues with some follow-on patches.
======================================================================
IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.
* Remove the free buffer cache (which will be replaced by buffer pool's
own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
interfaces - previously callers of ReturnBuffer() were fudging
the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
'is_cancelled_'.
* Clarify the logic around calling Close() for the last
BufferDescriptor.
-> There appeared to be an implicit assumption that buffers would be
freed in the order they were returned from the scan range, so that
the "eos" buffer was returned last. Instead just count the number
of outstanding buffers to detect the last one.
-> Touching the is_cancelled_ field without holding a lock was hard to
reason about - it violated locking rules and it was unclear whether
it was race-free.
* Remove DiskIoMgr::Read() to simplify the interface. It is trivial to
inline at the callsites.
This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.
Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight
======================================================================
IMPALA-4835: Part 2: Allocate scan range buffers upfront
This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.
One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.
There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.
One tricky part of the existing code was that it called
IssueInitialRanges() with empty lists of files and depended on
DiskIoMgr::AddScanRanges() not checking for cancellation in that case.
See IMPALA-6564/IMPALA-6588. I changed the logic to not try to issue
ranges for empty lists of files.
I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.
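A toy model of the fixed-buffer scheme (illustrative names, not the
DiskIoMgr API):
  from collections import deque

  class ScanRangeBuffers(object):
      """Each ScanRange works through a fixed set of buffers allocated up
      front. When none are free the range is blocked; returning a buffer
      unblocks it."""

      def __init__(self, num_buffers, buffer_size=8192):
          self.free = deque(bytearray(buffer_size)
                            for _ in range(num_buffers))

      def get_buffer(self):
          # None means the range is blocked until a buffer is returned.
          return self.free.popleft() if self.free else None

      def return_buffer(self, buf):
          self.free.append(buf)       # recycled, unblocking the range

  r = ScanRangeBuffers(1)
  b = r.get_buffer()
  print(r.get_buffer() is None)   # True: blocked
  r.return_buffer(b)
  print(r.get_buffer() is None)   # False: unblocked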
Testing:
* Ran core and exhaustive tests.
======================================================================
IMPALA-4835: Part 3: switch I/O buffers to buffer pool
This is the final patch to switch the Disk I/O manager to allocate all
buffers from the buffer pool and to reserve the buffers required for
a query upfront.
* The planner reserves enough memory to run a single scanner per
scan node.
* The multi-threaded scan node must increase reservation before
spinning up more threads.
* The scanner implementations must be careful to stay within their
assigned reservation.
The row-oriented scanners were the most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.
Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.
I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.
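A sketch of the per-column reservation split under that heuristic
(simplified; the scanner's actual algorithm is more involved):
  def divide_reservation(total_reservation, col_bytes_on_disk, min_buffer):
      # Each column gets a share proportional to its on-disk size, but
      # never less than one minimum-sized buffer.
      total = sum(col_bytes_on_disk)
      return [max(min_buffer, total_reservation * size // total)
              for size in col_bytes_on_disk]

  # A wide column gets most of the reservation; a tiny one gets the floor.
  print(divide_reservation(32 << 20, [100 << 20, 8 << 10], 8192))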
Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
since the algorithm is non-trivial.
Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.
Change-Id: I3ef471dc0746f0ab93b572c34024fc7343161f00
Reviewed-on: http://gerrit.cloudera.org:8080/9679
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Tim Armstrong <tarmstrong@cloudera.com>
Support row_regex and other special lines in the subset and superset
verifiers, which previously assumed that lines in the actual and
expected output had to match exactly.
Use this in test_stats_extrapolation to make the test more robust to
irrelevant changes in the explain plan.
Testing:
Manually modified a superset and a subset test to check that tests fail
as expected.
Change-Id: Ia7a28d421c8e7cd84b14d07fcb71b76449156409
Reviewed-on: http://gerrit.cloudera.org:8080/10155
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Revert "IMPALA-6585: increase test_low_mem_limit_q21 limit"
This reverts commit 25bcb258df.
Revert "IMPALA-6588: don't add empty list of ranges in text scan"
This reverts commit d57fbec6f6.
Revert "IMPALA-4835: Part 3: switch I/O buffers to buffer pool"
This reverts commit 24b4ed0b29.
Revert "IMPALA-4835: Part 2: Allocate scan range buffers upfront"
This reverts commit 5699b59d0c.
Revert "IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation"
This reverts commit 65680dc421.
Change-Id: Ie5ca451cd96602886b0a8ecaa846957df0269cbb
Reviewed-on: http://gerrit.cloudera.org:8080/9480
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
This is the final patch to switch the Disk I/O manager to allocate all
buffers from the buffer pool and to reserve the buffers required for
a query upfront.
* The planner reserves enough memory to run a single scanner per
scan node.
* The multi-threaded scan node must increase reservation before
spinning up more threads.
* The scanner implementations must be careful to stay within their
assigned reservation.
The row-oriented scanners were the most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.
Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.
I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.
Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
since the algorithm is non-trivial.
Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.
Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Reviewed-on: http://gerrit.cloudera.org:8080/8966
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Introduces a new TBLPROPERTY for controlling stats
extrapolation on a per-table basis:
impala.enable.stats.extrapolation=true/false
The property key was chosen to be consistent with
the impalad startup flag --enable_stats_extrapolation
and to indicate that the property was set and is used
by Impala.
Behavior:
- If the property is not set, then the extrapolation
behavior is determined by the impalad startup flag.
- If the property is set, it overrides the impalad
startup flag, i.e., extrapolation can be explicitly
enabled or disabled regardless of the startup flag.
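The resulting precedence amounts to the following sketch (the helper
name is hypothetical):
  def extrapolation_enabled(tbl_properties, startup_flag):
      # The table property, if set, overrides the impalad startup flag.
      value = tbl_properties.get("impala.enable.stats.extrapolation")
      if value is None:
          return startup_flag
      return value.lower() == "true"

  print(extrapolation_enabled({}, False))                      # False
  print(extrapolation_enabled(
      {"impala.enable.stats.extrapolation": "true"}, False))   # True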
Testing:
- added new unit tests
- core/hdfs run passed
Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd
Reviewed-on: http://gerrit.cloudera.org:8080/9139
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
This change enables clustering by default. IMPALA-2521 introduced the
'clustered' hint which inserts a local sort by the partitioning columns
to a query plan. The hint is only effective for HDFS and Kudu tables.
Like before, the 'noclustered' hint prevents clustering. If a table has
ordering columns defined, the 'noclustered' hint is ignored and we
issue a warning.
This change removes some tests that were added specifically to test
that clustering can be enabled using the 'clustered' hint. It changes
some tests to use the 'noclustered' hint to make sure that clustering
can be disabled. It also adds tests to make sure that we cover the
'noclustered' case properly.
Cherry-picks: not for 2.x.
Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Reviewed-on: http://gerrit.cloudera.org:8080/9153
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins
Adds the TABLESAMPLE clause for COMPUTE STATS.
Syntax:
COMPUTE STATS <table> TABLESAMPLE SYSTEM(<number>) [REPEATABLE(<number>)]
Computes and replaces the table-level row count and total file size,
as well as all table-level column statistics. Existing partition-level
row counts are not modified.
The TABLESAMPLE clause can be used to limit the scanned data volume to
a desired percentage. When sampling, the unmodified results of the
COMPUTE STATS queries are sent to the CatalogServer. There, the stats
are extrapolated before storing them into the HMS so as not to confuse
other engines like Hive/SparkSQL which may rely on the shared HMS
fields being accurate.
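The extrapolation step amounts to roughly the following (a simplified
sketch with illustrative names):
  def extrapolate_row_count(sampled_rows, sampled_bytes, total_bytes):
      # Scale the row count observed on the sample by the ratio of total
      # to sampled bytes before the stats are stored in the HMS.
      if sampled_bytes <= 0:
          return -1
      return int(round(sampled_rows * (float(total_bytes) / sampled_bytes)))

  # Sampling ~10% of the bytes and counting 730 rows suggests ~7300 rows.
  print(extrapolate_row_count(730, 47845, 478450))   # 7300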
Limitations
- Only works for HDFS tables
- TABLESAMPLE is not supported for COMPUTE INCREMENTAL STATS
- TABLESAMPLE requires --enable_stats_extrapolation=true
Changes to EXPLAIN
The stored statistics from the HMS are more clearly displayed under
a 'stored statistics' section. Example:
00:SCAN HDFS [functional.alltypes, RANDOM]
partitions=24/24 files=24 size=478.45KB
stored statistics:
table: rows=7300 size=478.45KB
partitions: 24/24 rows=7300
columns: all
Testing:
- added new functional tests
- core/hdfs run passed
Change-Id: I7f3e72471ac563adada4a4156033a85852b7c8b7
Reviewed-on: http://gerrit.cloudera.org:8080/8136
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
Before this change, the per-host reservation size was computed
by the Planner. However, scheduling happens after planning,
so the Planner must assume that all fragments run on all
hosts, and the reservation size is likely much larger than
it needs to be.
This moves the computation of the per-host reservation size
to the BE where it can be computed more precisely. This also
includes a number of plan/profile changes.
Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac
Reviewed-on: http://gerrit.cloudera.org:8080/7630
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
This moves away from the PipelinedPlanNodeSet approach of enumerating
sets of concurrently-executing nodes because unions would force
creating many overlapping sets of nodes. The new approach computes
the peak resources during Open() and the peak resources between Open()
and Close() (i.e. while calling GetNext()) bottom-up for each plan node
in a fragment. The fragment resources are then combined to produce the
query resources.
The basic assumptions for the new resource estimates are:
* resources are acquired during or after the first call to Open()
and released in Close().
* Blocking nodes call Open() on their child before acquiring
their own resources (this required some backend changes).
* Blocking nodes call Close() on their children before returning
from Open().
* The peak resource consumption of the query is the sum of the
independent fragments (except for the parallel join build plans
where we can assume there will be synchronisation). This is
conservative but we don't synchronise fragment Open() and Close()
across exchanges so can't make stronger assumptions in general.
Also compute the sum of minimum reservations. This will be useful
in the backend to determine exactly when all of the initial
reservations have been claimed from a shared pool of initial reservations.
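A toy model of the bottom-up combination under these assumptions (not
the planner's code):
  def peaks(node):
      # Returns (peak during Open(), peak while calling GetNext()).
      # Children are open while this node opens; a blocking node closes
      # its children before returning from Open(), so they drop out of
      # the GetNext() figure.
      child = [peaks(c) for c in node["children"]]
      during_open = node["mem"] + sum(g for _, g in child)
      during_get_next = node["mem"] + (
          0 if node["blocking"] else sum(g for _, g in child))
      return during_open, during_get_next

  scan = {"mem": 16, "blocking": False, "children": []}
  agg = {"mem": 128, "blocking": True, "children": [scan]}
  print(peaks(agg))   # (144, 128): the scan closes once agg's Open() returns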
Testing:
* Updated planner tests to reflect behavioural changes.
* Added extra resource requirement planner tests for unions, subplans,
pipelines of blocking operators, and bushy join plans.
* Added single-node plans to resource-requirements tests. These have
more complex plan trees inside a single fragment, which is useful
for testing the peak resource requirement logic.
Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Reviewed-on: http://gerrit.cloudera.org:8080/7223
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
This is similar to the single-node execution optimisation, but applies
to slightly larger queries that should run in a distributed manner but
won't benefit from codegen.
This adds a new query option disable_codegen_rows_threshold that
defaults to 50,000. If fewer than this number of rows are processed
by a plan node per impalad, the cost of codegen almost certainly
outweighs the benefit.
Using rows processed as a threshold is justified by a simple
model that assumes the cost of codegen and execution per row for
the same operation are proportional. E.g. if x is the complexity
of the operation, n is the number of rows processed, C is a
constant factor giving the cost of codegen and Ec/Ei are constant
factor giving the cost of codegen'd and interpreted execution and
d, then the cost of the codegen'd operator is C * x + Ec * x * n
and the cost of the interpreted operator is Ei * x * n. Rearranging
means that interpretation is cheaper if n < C / (Ei - Ec), i.e. that
(at least with the simplified model) it makes sense to choose
interpretation or codegen based on a constant threshold. The
model also implies that it is somewhat safer to choose codegen
because the additional cost of codegen is O(1) but the additional
cost of interpretation is O(n).
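Plugging illustrative constants into the model shows the constant
crossover:
  def interpretation_cheaper(n, C, Ec, Ei):
      # Interpreted cost Ei*x*n beats codegen'd cost C*x + Ec*x*n exactly
      # when n < C / (Ei - Ec); the complexity factor x cancels out.
      return n < C / (Ei - Ec)

  C, Ec, Ei = 50000.0, 1.0, 2.0          # crossover at n = 50,000 rows
  print(interpretation_cheaper(40000, C, Ec, Ei))   # True  -> skip codegen
  print(interpretation_cheaper(60000, C, Ec, Ei))   # False -> codegen wins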
I ran some experiments with TPC-H Q1, varying the input table size, to
determine what the cut-over point where codegen was beneficial was.
The cutover was around 150k rows per node for both text and parquet.
At 50k rows per node disabling codegen was very beneficial - around
0.12s versus 0.24s. To be somewhat conservative I set the default
threshold to 50k rows. On more complex queries, e.g. TPC-H Q10, the
cutover tends to be higher because there are plan nodes that process
many fewer than the max rows.
Fix a couple of minor issues in the frontend: the numNodes_
calculation could return 0 for Kudu, and the single node optimization
didn't correctly handle the case of a scan node with conjuncts, a limit,
and missing stats (it considered the estimate still valid).
Testing:
Updated e2e tests that set disable_codegen to set
disable_codegen_rows_threshold to 0, so that those tests run both
with and without codegen still.
Added an e2e test to make sure that the optimisation is applied in
the backend.
Added planner tests for various cases where codegen should and shouldn't
be disabled.
Perf:
Added a targeted perf test for a join+agg over a small input, which
benefits from this change.
Change-Id: I273bcee58641f5b97de52c0b2caab043c914b32e
Reviewed-on: http://gerrit.cloudera.org:8080/7153
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
The main idea of this patch is to use table stats to
extrapolate the row counts for new/modified partitions.
Existing behavior:
- Partitions that lack the row count stat are ignored
when estimating the cardinality of HDFS scans. Such
partitions effectively have an estimated row count
of zero.
- We always use the row count stats for partitions that
have one. The row count may be innaccurate if data in
such partitions has changed significantly.
Summary of changes:
- Enhance COMPUTE STATS to also store the total number
of file bytes in the table.
- Use the table-level row count and file bytes stats
to estimate the number of rows in a scan (see the sketch
after this list).
- A new impalad startup flag is added to enable/disable
the extrapolation behavior. The feature is disabled by
default. Note that even with the feature disabled,
COMPUTE STATS stores the file bytes so you can enable
the feature without having to run COMPUTE STATS again.
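As referenced in the summary above, a simplified sketch of the
extrapolation (illustrative names, not the Java planner code):
  def extrapolate_scan_rows(stats_rows, stats_bytes, scan_bytes):
      # Assume rows-per-byte is stable: scale the row count recorded by
      # COMPUTE STATS by the ratio of the bytes being scanned to the
      # bytes present when stats were computed.
      if stats_rows < 0 or stats_bytes <= 0:
          return -1                     # stats missing; cannot extrapolate
      return int(round(stats_rows * (float(scan_bytes) / stats_bytes)))

  # Stats recorded 1000 rows over 1 MB; the table has grown to 1.5 MB.
  print(extrapolate_scan_rows(1000, 1 << 20, 3 << 19))   # 1500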
Testing:
- Added new FE unit test
- Added new EE test
Change-Id: I972c8a03ed70211734631a7dc9085cb33622ebc4
Reviewed-on: http://gerrit.cloudera.org:8080/6840
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins