precision.
This commit follows 16d8dd58.
This patch adds a test case that inspects the thrift profile of a
completed query, and verifies that the "Start Time" and
"End Time" of the query have nanosecond precision. We chose to
work with the thrift profile directly, rather than parse the debug
web page, as it is the thrift profile which is consumed by
management API clients of Impala.
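As an illustration, a check along these lines can be applied to the
"Start Time" and "End Time" strings extracted from the thrift profile
(the timestamp format and helper below are assumptions for this sketch,
not the test's actual code):

    import re

    # Nanosecond precision means nine digits after the decimal point,
    # e.g. "2017-12-14 11:31:06.383829000".
    TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.(\d+)$")

    def has_nanosecond_precision(timestamp):
        match = TIMESTAMP_RE.match(timestamp)
        return match is not None and len(match.group(1)) == 9

    assert has_nanosecond_precision("2017-12-14 11:31:06.383829000")
    assert not has_nanosecond_precision("2017-12-14 11:31:06.383")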
Change-Id: Id3421a34cc029ebca551730084c7cbd402d5c109
Reviewed-on: http://gerrit.cloudera.org:8080/8784
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Impala Public Jenkins
test_profile_fragment_instances was recently added to verify that the
final runtime profile for a query has the expected fragments and exec
nodes. The test fails on local filesystem builds, though, as it
assumes there will be 3 impalads and therefore 3 fragment instances,
but there is only 1 impalad on local filesystem builds.
The fix is to disable the test on local filesystem builds.
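A minimal sketch of how such a restriction can be expressed with a
pytest skip marker (the environment variable and value are assumptions;
Impala keeps its real skip markers in the shared test infrastructure):

    import os
    import pytest

    # Skip when the build targets the local filesystem, where only one
    # impalad (and therefore one instance per fragment) is running.
    IS_LOCAL_FS = os.environ.get("TARGET_FILESYSTEM") == "local"

    skip_if_local_fs = pytest.mark.skipif(
        IS_LOCAL_FS, reason="assumes 3 impalads; local FS builds run 1")

    @skip_if_local_fs
    def test_profile_fragment_instances():
        pass  # the real test inspects the final runtime profile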
Change-Id: I2c98f160406081626f17709809b8efee9eae1450
Reviewed-on: http://gerrit.cloudera.org:8080/8809
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins
test_basic_filters has been occasionally failing due to a line missing
from a runtime profile for a particular query.
The problem is that the query returns all of its results before all of
its fragment instances are finished executing (due to a limit). Then,
when one fragment instance reports its status, the coordinator returns
to it a 'cancelled' status, causing all remaining instances for that
backend to be cancelled.
Sometimes this cancellation happens quickly enough that the relevant
fragment instances have not yet sent a status report when they are
cancelled. They will still send a report in finalize, but as the
coordinator only updates its runtime profile for 'ok' status reports,
not 'cancelled', the final runtime profile doesn't end up with any
data for those fragment instances, which means the test does not find
the line in the runtime profile it's checking for.
The fix is to have the coordinator update its runtime profile with
every status report it receives, regardless of error status.
Testing:
- Ran existing runtime profile tests, which rely on profile output,
in a loop.
- Manually tested some scenarios with failed queries and checked that
the new profile output is reasonable.
- Added a new e2e test that runs the affected query and checks for the
presence of info for all expected exec nodes in the profile (sketched
below). This repros the underlying issue consistently.
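A rough sketch of that profile check (the node labels and the way the
profile text is obtained are assumptions; the real test uses Impala's
e2e framework to run the query and fetch the final profile):

    # Verify the final runtime profile has an entry for every exec node
    # the query plan is expected to contain.
    EXPECTED_NODES = ["HDFS_SCAN_NODE", "HASH_JOIN_NODE", "EXCHANGE_NODE"]

    def assert_all_exec_nodes_present(profile_text):
        missing = [n for n in EXPECTED_NODES if n not in profile_text]
        assert not missing, "exec nodes missing from profile: %s" % missing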
Change-Id: I4f581c7c8039f02a33712515c5bffab942309bba
Reviewed-on: http://gerrit.cloudera.org:8080/8754
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
This separation will help the user better understand the query
runtime profile.
Testing:
Modified an existing test case.
Change-Id: Ibfc7832963fa0bd278a45c06a5a54e1bf40d8876
Reviewed-on: http://gerrit.cloudera.org:8080/7721
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
Fix to populate the non-default query options set by the planner in the
runtime profile.
Added a corresponding test case.
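A hedged sketch of the kind of check the test case performs (the option
name and profile label below are assumptions, not the exact strings):

    def assert_planner_set_option_in_profile(profile_text, option="mt_dop"):
        # The profile should contain a line listing the non-default query
        # options, including those set by the planner rather than the user.
        option_lines = [l for l in profile_text.splitlines()
                        if "Query Options" in l]
        assert option_lines, "no 'Query Options' line found in the profile"
        assert any(option in l for l in option_lines), \
            "planner-set option %r not listed in the profile" % option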
Change-Id: I08e9dc2bebb83101976bbbd903ee48c5068dbaab
Reviewed-on: http://gerrit.cloudera.org:8080/7419
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
This patch leverages the AdlFileSystem in Hadoop to allow
Impala to talk to the Azure Data Lake Store. This patch has
functional changes as well as adds test infrastructure for
testing Impala over ADLS.
We do not support ACLs on ADLS since the Hadoop ADLS
connector does not integrate ADLS ACLs with Hadoop users/groups.
For testing, we use the azure-data-lake-store-python client
from Microsoft. This client seems to have some consistency
issues. For example, a drop table through Impala will delete
the files in ADLS; however, listing that directory through
the Python client immediately after the drop will still show
the files. This behavior is unexpected since ADLS claims to be
strongly consistent. Some tests have been skipped due to this
limitation with the tag SkipIfADLS.slow_client. Tracked by
IMPALA-5335.
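For reference, a sketch of how the SkipIfADLS.slow_client tag can be
applied (the marker construction is an assumption; Impala defines the
real marker in its shared skip-marker module):

    import os
    import pytest

    IS_ADLS = os.environ.get("TARGET_FILESYSTEM") == "adls"

    class SkipIfADLS(object):
        # Skip tests that would trip over the listing lag described above.
        slow_client = pytest.mark.skipif(
            IS_ADLS, reason="the python ADLS client lags behind ADLS state")

    @SkipIfADLS.slow_client
    def test_drop_table_removes_files():
        pass  # would list the table directory right after DROP TABLE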
The azure-data-lake-store-python client also only works on CentOS 6.6
and later, so the Python dependencies for Azure will not be downloaded
when TARGET_FILESYSTEM is not "adls". ADLS tests are expected to be run
on a machine running at least CentOS 6.6.
Note: This is only a test limitation, not a functional one. Clusters
with older OSes like CentOS 6.4 will still work with ADLS.
Added another dependency to bootstrap_build.sh for the ADLS Python
client.
Testing: Ran core tests with and without TARGET_FILESYSTEM as
'adls' to make sure that all tests pass and that nothing breaks.
Change-Id: Ic56b9988b32a330443f24c44f9cb2c80842f7542
Reviewed-on: http://gerrit.cloudera.org:8080/6910
Tested-by: Impala Public Jenkins
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Previously, updates to the query state in ClientRequestState were
not immediately reflected in the query profile, potentially
leading to the profile showing an incorrect state for an extended
period during execution.
In particular, queries were being shown in the 'CREATED' state
long after they had started 'RUNNING'.
The fix is to update the profile whenever the state is updated.
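An illustrative check of that behavior (the profile label and state
names are assumptions based on this description):

    def assert_profile_not_stuck_in_created(profile_text):
        # Once the query has started running, the profile's state line
        # should no longer report CREATED.
        state_lines = [l for l in profile_text.splitlines()
                       if "Query State" in l]
        assert state_lines, "no query state found in the profile"
        assert "CREATED" not in state_lines[0], \
            "profile still reports CREATED after the query started running"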
Testing:
- Extended existing hs2 tests and added a beeswax test to check
for expected query states in the profile
Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Reviewed-on: http://gerrit.cloudera.org:8080/6923
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins
A test that was recently added, test_observability::test_scan_summary,
uses an HBase table. It needs to be restricted not to run on S3,
localFS or Isilon.
Change-Id: I9863cf3f885eb1d2152186de34e093497af83d99
Reviewed-on: http://gerrit.cloudera.org:8080/6859
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins
For scan nodes, previously only HDFS tables showed the name
of the table in the 'Detail' section for the scan node. This
change adds the table name for all scan node types (Kudu,
HBase, and DataSource).
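A rough sketch of the check (scan-node labels are assumptions based on
typical profile output):

    def assert_scan_details_name_table(profile_text, table_name):
        # Every scan node's 'Detail' line should now mention the table
        # being scanned, whether it is an HDFS, Kudu, HBase or
        # DataSource scan.
        scan_lines = [l for l in profile_text.splitlines() if "SCAN" in l]
        assert scan_lines, "no scan nodes found in the profile"
        assert any(table_name in l for l in scan_lines), \
            "no scan node detail mentions table %r" % table_name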
Testing:
- Added an e2e test in test_observability.
Change-Id: If4fd13f893aea4e7df8a2474d7136770660e4324
Reviewed-on: http://gerrit.cloudera.org:8080/6832
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
IMPALA-3002:
The shell prints an incorrect value for '#Rows' in the exec
summary for broadcast nodes due to incorrect logic around
whether to use max or agg stats. This patch makes the behavior
consistent with the way the backend treats exec summaries in
summary-util.cc. This incorrect logic was also duplicated in
the impala_beeswax test framework.
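A simplified sketch of the corrected reasoning (the function and field
names are assumptions, not the shell's actual data structures): every
instance of a broadcast receiver sees the full row set, so summing the
per-instance counts overstates '#Rows'; the max across instances is the
right aggregate, matching how the backend builds the exec summary.

    def rows_for_exec_summary(per_instance_rows, is_broadcast):
        # Broadcast: each instance received a full copy, so take the max.
        # Otherwise instances process disjoint data, so take the sum.
        return max(per_instance_rows) if is_broadcast \
            else sum(per_instance_rows)

    # Three receivers of a 100-row broadcast should report 100, not 300.
    assert rows_for_exec_summary([100, 100, 100], is_broadcast=True) == 100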
IMPALA-1473:
When there is a merging exchange with a limit, we may copy rows
into the output batch beyond the limit. In this case, we currently
update the output batch's size to reflect the limit, but we also
need to update ExecNode::num_rows_returned_ or the exec summary
may show that the exchange node returned more rows than it really
did.
Additionally, PlanFragmentExecutor::GetNext does not update
rows_produced_counter_ in some cases, leading the runtime profile
to display an incorrect value for 'RowsProduced'.
Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Reviewed-on: http://gerrit.cloudera.org:8080/4679
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins