Commit Graph

20 Commits

Author SHA1 Message Date
Attila Jeges
c8aa5796d9 IMPALA-10879: Add parquet stats to iceberg manifest
This patch adds Parquet stats to the Iceberg manifest as per-datafile
metrics.

The following metrics are supported:
- column_sizes:
  Map from column id to the total size on disk of all regions that
  store the column. Does not include bytes necessary to read other
  columns, like footers.

- null_value_counts:
  Map from column id to number of null values in the column.

- lower_bounds:
  Map from column id to lower bound in the column serialized as
  binary. Each value must be less than or equal to all non-null,
  non-NaN values in the column for the file.

- upper_bounds:
  Map from column id to upper bound in the column serialized as
  binary. Each value must be greater than or equal to all non-null,
  non-NaN values in the column for the file.

The corresponding parquet stats are collected by 'ColumnStats'
(in 'min_value_', 'max_value_', 'null_count_' members) and
'HdfsParquetTableWriter::BaseColumnWriter' (in
'total_compressed_byte_size_' member).
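
As an illustration of how a collected min value becomes a lower_bounds
entry, here is a minimal sketch of the single-value binary
serialization (a hypothetical helper, not Impala's actual code; assumes
a little-endian host):

  #include <cstdint>
  #include <cstring>
  #include <string>

  // Iceberg single-value serialization (spec Appendix D): an INT bound
  // is stored as its 4 bytes in little-endian order. The manifest maps
  // the column's Iceberg field id to these bytes.
  std::string SerializeInt32Bound(int32_t min_or_max) {
    std::string out(sizeof(min_or_max), '\0');
    std::memcpy(&out[0], &min_or_max, sizeof(min_or_max));  // LE host assumed
    return out;
  }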

Testing:
- New e2e test was added to verify that the metrics are written to the
  Iceberg manifest upon inserting data.
- New e2e test was added to verify that lower_bounds/upper_bounds
  metrics are used to prune data files on querying iceberg tables.
- Existing e2e tests were updated to work with the new behavior.
- BE test for single-value serialization.

Relevant Iceberg documentation:
- Manifest:
  https://iceberg.apache.org/spec/#manifests
- Values in lower_bounds and upper_bounds maps should be Single-value
  serialized to binary:
  https://iceberg.apache.org/spec/#appendix-d-single-value-serialization

Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Reviewed-on: http://gerrit.cloudera.org:8080/17806
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Attila Jeges <attilaj@cloudera.com>
2021-09-02 21:34:41 +00:00
Csaba Ringhofer
c65d7861d9 IMPALA-10656: Fire insert events before commit
Before this fix Impala committed an insert first, then reloaded the
table from HMS, and generated the insert events based on the difference
between the two snapshots (e.g. which files were absent from the old
snapshot but present in the new one).

Hive replication expects the insert events before the commit, so the
old ordering could lead to issues there.

The solution is to collect the new files during the insert in the
backend, and send the insert events based on this file set. This wasn't
very hard to do as we were already collecting the files in some cases:
- to move them from staging dir to their final location in case of
  non-partitioned tables
- to write the file list to snapshot files in case of Iceberg tables
This patch unifies the paths above and collects all information about
the created files regardless of the table type.
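
A rough sketch of the unified flow, with hypothetical names (the real
patch routes this through the existing file-collection machinery):

  #include <string>
  #include <vector>

  struct CreatedFile { std::string partition; std::string path; };

  void FireInsertEvents(const std::vector<CreatedFile>& files);  // hypothetical
  void CommitInsert(const std::vector<CreatedFile>& files);      // hypothetical

  // The backend collects the created files during the insert, so the
  // events can be generated from this set and fired *before* the
  // commit, instead of being derived from an HMS snapshot diff after.
  void FinalizeInsert(const std::vector<CreatedFile>& new_files) {
    FireInsertEvents(new_files);  // notify HMS first
    CommitInsert(new_files);      // then make the files visible
  }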

Testing:
- no new tests, insert events were already covered in
  test_event_processing.py and MetastoreEventsProcessorTest.java
- ran core tests

Change-Id: I2ed812dbcb5f55efff3a910a3daeeb76cd3295b9
Reviewed-on: http://gerrit.cloudera.org:8080/17313
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-04-27 00:41:05 +00:00
wzhou-code
b5e2a0ce2e IMPALA-9224: Blacklist nodes with faulty disk for spilling
This patch extends the blacklist functionality by adding an executor
node to the blacklist if a query fails due to a disk failure during
spill-to-disk. It also classifies disk error codes and defines a set of
blacklistable errors for non-transient disk failures. The coordinator
blacklists an executor only if the executor hit a blacklistable error
during spill-to-disk.
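
A minimal sketch of that classification, with an assumed error set for
illustration (the exact set lives in the disk-error handling code):

  #include <cerrno>
  #include <unordered_set>

  // Non-transient disk errors justify blacklisting the executor;
  // transient conditions do not. The codes below are assumed examples,
  // not the exact set in the patch.
  bool IsBlacklistableDiskError(int posix_err) {
    static const std::unordered_set<int> kBlacklistable = {EIO, EROFS, ENODEV};
    return kBlacklistable.count(posix_err) > 0;
  }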

Adds a new debug action to simulate disk write errors during
spill-to-disk. To use it, specify in the query options:
  'debug_action': 'IMPALA_TMP_FILE_WRITE:<hostname>:<port>:<action>'

  where <hostname> and <port> identify the impalad that executes the
  fragment instances; <port> is the BE KRPC port (default 27000).

Adds new test cases for blacklist and query-retry to cover the code
changes.

Testing:
 - Passed new test cases.
 - Passed exhaustive test.
 - Manually simulated disk failures in scratch directories on nodes
   of a cluster, verified that the nodes were blacklisted as
   expected.

Change-Id: I04bfcb7f2e0b1ef24a5b4350f270feecd8c47437
Reviewed-on: http://gerrit.cloudera.org:8080/16949
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-04 05:12:42 +00:00
Fucun Chu
4099a60689 IMPALA-10317: Add query option that limits huge joins at runtime
This patch adds support for limiting the rows produced by a join node
such that runaway join queries can be prevented.

The limit is specified by a query option. Queries exceeding that limit
get terminated. The checking runs periodically, so the actual rows
produced may go somewhat over the limit.

JOIN_ROWS_PRODUCED_LIMIT is exposed as an advanced query option.

The query profile is updated to include query-wide and per-backend
metrics for rows produced (RowsReturned). Example from:
  set JOIN_ROWS_PRODUCED_LIMIT = 10000000;
  select count(*) from tpch_parquet.lineitem l1 cross join
  (select * from tpch_parquet.lineitem l2 limit 5) l3;

NESTED_LOOP_JOIN_NODE (id=2):
   - InactiveTotalTime: 107.534ms
   - PeakMemoryUsage: 16.00 KB (16384)
   - ProbeRows: 1.02K (1024)
   - ProbeTime: 0.000ns
   - RowsReturned: 10.00M (10002025)
   - RowsReturnedRate: 749.58 K/sec
   - TotalTime: 13s337ms
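
A sketch of the periodic check with hypothetical member names; because
it runs per output batch, the produced rows can overshoot the limit
somewhat before the query is terminated:

  #include <cstdint>

  struct JoinNodeSketch {
    int64_t rows_returned_ = 0;
    int64_t join_rows_produced_limit_ = 0;  // from the query option; 0 = off

    // Called periodically (per output batch), which is why the actual
    // rows produced can exceed the limit somewhat.
    bool ExceededLimit(int64_t batch_rows) {
      rows_returned_ += batch_rows;
      return join_rows_produced_limit_ > 0 &&
             rows_returned_ > join_rows_produced_limit_;
    }
  };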

Testing:
 Added tests for JOIN_ROWS_PRODUCED_LIMIT

Change-Id: Idbca7e053b61b4e31b066edcfb3b0398fa859d02
Reviewed-on: http://gerrit.cloudera.org:8080/16706
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-22 06:10:39 +00:00
Tim Armstrong
9429bd779d IMPALA-9382: part 2/3: aggregate profiles sent to coordinator
This reworks the status reporting so that serialized
AggregatedRuntimeProfile objects are sent from executors
to coordinators. These profiles are substantially denser
and faster to process for higher mt_dop values. The aggregation
is also done in a single step, merging the aggregated thrift
profile from the executor directly into the final aggregated
profile, instead of converting it to an unaggregated profile
first.

The changes required were:
* A new Update() method for AggregatedRuntimeProfile that
  updates the profile from a serialized AggregatedRuntimeProfile
  for a subset of the instances. The code is generalized from the
  existing InitFromThrift() code path.
* Per-fragment reports included in the status report protobuf
  when --gen_experimental_profile=true.
* Logic on the coordinator that either consumes serialized
  AggregatedRuntimeProfile per fragment, when
  --gen_experimental_profile=true, or consumes a serialized
  RuntimeProfile per finstance otherwise.
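
To illustrate the aggregation step, a toy merge of a single counter
over a set of instance values (the real Update() walks entire profile
trees; the names here are made up):

  #include <algorithm>
  #include <cstdint>
  #include <vector>

  // One counter aggregated over N fragment instances: the dense form
  // keeps summary stats instead of one counter per instance.
  struct AggCounterSketch {
    int64_t min = INT64_MAX, max = INT64_MIN, sum = 0;
    void Merge(const std::vector<int64_t>& instance_values) {
      for (int64_t v : instance_values) {
        min = std::min(min, v);
        max = std::max(max, v);
        sum += v;
      }
    }
  };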

This also adds support for event sequences and time series
in the aggregated profile, so the amount of information
in the aggregated profile is now on par with the basic profile.

We also finish off support for JSON profile. The JSON profile is
more stripped down because we do not need to round-trip profiles
via JSON and it is a much less dense profile representation.

Part 3 will clean up and improve the display of the profile.

Testing:
* Add sanity tests for aggregated runtime profile.
* Add unit tests to exercise aggregation of the various counter types
* Ran core tests.

Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca
Reviewed-on: http://gerrit.cloudera.org:8080/16057
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-11-26 06:50:41 +00:00
wzhou-code
1af60a1560 IMPALA-9180 (part 3): Remove legacy backend port
The legacy Thrift based Impala internal service has been removed so
the backend port 22000 can be freed up.

This patch sets the flag be_port as a REMOVED_FLAG and cleans up all
infrastructure around it. StatestoreSubscriber::subscriber_id is now
set as hostname + krpc_port.

Testing:
 - Passed the exhaustive test.

Change-Id: Ic6909a8da449b4d25ee98037b3eb459af4850dc6
Reviewed-on: http://gerrit.cloudera.org:8080/16533
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-11-03 00:56:26 +00:00
Zoltan Borok-Nagy
981ef10465 IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet)
This commit adds support for INSERT INTO statements against Iceberg
tables when the table is non-partitioned and the underlying file format
is Parquet.

We still use Impala's HdfsParquetTableWriter to write the data files,
though they needed some modifications to conform to the Iceberg spec,
namely:
 * write Iceberg/Parquet 'field_id' for the columns
 * TIMESTAMPs are encoded as INT64 micros (without time zone)
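
For the second point, a minimal sketch of the micros encoding
(hypothetical helper; the real conversion happens inside the modified
HdfsParquetTableWriter):

  #include <cstdint>

  // Iceberg 'timestamp' (without zone) is written to Parquet as INT64
  // microseconds since the Unix epoch.
  int64_t ToIcebergMicros(int64_t unix_seconds, int32_t nanos) {
    return unix_seconds * 1000000LL + nanos / 1000;
  }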

We use DmlExecState to transfer information from the table sink
operators to the coordinator, then updateCatalog() invokes the
AppendFiles API to add files atomically. DmlExecState is encoded in
protobuf, while communication with the Frontend uses Thrift; therefore,
to avoid defining the Iceberg DataFile in multiple IDLs, the DataFiles
are stored as FlatBuffers.

The commit also does some corrections on Impala type <-> Iceberg type
mapping:
 * Impala TIMESTAMP is Iceberg TIMESTAMP (without time zone)
 * Impala CHAR is Iceberg FIXED

Testing:
 * Added INSERT tests to iceberg-insert.test
 * Added negative tests to iceberg-negative.test
 * I also did some manual testing with Spark. Spark is able to read
   Iceberg tables written by Impala unless they contain TIMESTAMPs. In
   that case Spark rejects the data files because it only accepts
   TIMESTAMPs with time zone.
 * Added concurrent INSERT tests to test_insert_stress.py

Change-Id: I5690fb6c2cc51f0033fa26caf8597c80a11bcd8e
Reviewed-on: http://gerrit.cloudera.org:8080/16545
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-10-26 20:01:09 +00:00
wzhou-code
01316ad1c9 IMPALA-10154: Fix data race on coord_backend_id in TSAN build
This issue was introduced by the patch for IMPALA-5746.
QueryState::exec_rpc_params_.coord_backend_id is set in function
QueryState::Init(), but it could be accessed by the QueryExecMgr object
in QueryExecMgr::CancelQueriesForFailedCoordinators() before or during
the call to QueryState::Init(), hence causing a data race.
To fix this, move coord_backend_id from class ExecQueryFInstancesRequestPB
to class TQueryCtx. QueryState::query_ctx_ is a constant member set in
the QueryState c'tor, so QueryState::query_ctx_.coord_backend_id is
valid and will not change once the QueryState object is created.
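
The shape of the fix, sketched with simplified types (not the actual
class definitions):

  #include <cstdint>

  struct UniqueIdPB { int64_t hi = 0; int64_t lo = 0; };
  struct TQueryCtx { UniqueIdPB coord_backend_id; };

  class QueryStateSketch {
   public:
    // query_ctx_ is const and fully initialized in the constructor, so
    // a concurrent reader can never observe a half-set coord_backend_id.
    explicit QueryStateSketch(const TQueryCtx& ctx) : query_ctx_(ctx) {}
    const UniqueIdPB& coord_backend_id() const {
      return query_ctx_.coord_backend_id;
    }
   private:
    const TQueryCtx query_ctx_;
  };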

Testing:
 - Passed tests/custom_cluster/test_process_failures.py.
 - Passed the core tests for normal build.
 - Passed the core tests against a TSAN build.

Change-Id: I1c4b51e741a28b80bf3485adff8c97aabe0a3f67
Reviewed-on: http://gerrit.cloudera.org:8080/16437
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-09-14 21:50:37 +00:00
wzhou-code
9d43cfdaee IMPALA-5746: Cancel all queries scheduled by failed coordinators
Executors register for updates to the cluster membership. When
coordinators are absent from the active cluster membership list, an
executor cancels all running fragments of the queries scheduled by the
inactive coordinators, since the executor cannot send results back to
the inactive/failed coordinators. This lets executors quickly release
the resources allocated for those cancelled fragments.
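
A rough sketch of the membership callback, with hypothetical types and
helpers:

  #include <string>
  #include <unordered_set>
  #include <vector>

  struct RunningQuery { std::string coordinator_addr; /* ... */ };

  void CancelFragments(RunningQuery& q);  // hypothetical

  // Registered as a cluster-membership callback: cancel the fragments
  // of every query whose coordinator left the membership list, freeing
  // the resources those fragments hold.
  void OnMembershipUpdate(
      const std::unordered_set<std::string>& active_coordinators,
      std::vector<RunningQuery>* running_queries) {
    for (RunningQuery& q : *running_queries) {
      if (active_coordinators.count(q.coordinator_addr) == 0) {
        CancelFragments(q);
      }
    }
  }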

Testing:
- Added new test case TestProcessFailures::test_kill_coordinator
  and ran the test case as following command:
    ./bin/impala-py.test tests/custom_cluster/test_process_failures.py\
      ::TestProcessFailures::test_kill_coordinator \
      --exploration_strategy=exhaustive.
- Passed the core tests.

Change-Id: I918fcc27649d5d2bbe8b6ef47fbd9810ae5f57bd
Reviewed-on: http://gerrit.cloudera.org:8080/16215
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-07-31 21:39:08 +00:00
Tim Armstrong
67b4764853 IMPALA-9752: aggregate profile stats on executor
Before this change the coordinator depended on getting the full
fragment instance profiles from executors to pull out various
things. This change removes that dependency by pulling out the
information on the executor, and including it in the status
report protobuf. This should slightly reduce the amount of work
done on the coordinator, but more importantly, makes it easier
to switch to sending aggregated profiles from executor to
coordinator, because the coordinator no longer depends on
receiving individual instance profiles.

Per-host peak memory is included directly in the status report.

Per-backend stats - where the per-backend total is needed -
are aggregated on the executor with the result included in the
status report. These counters are: BytesRead, ScanRangesComplete,
TotalBytesSent, TotalThreads{User,Sys}Time.

One subtlety to keep in mind is that status reports don't include
stats for instances whose final update was sent in a previous
status report. So the executor needs to ensure that stats for
finished fragment instances are included in updates. This is
achieved by caching those values in FragmentInstanceState.
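
A sketch of that caching idea with made-up names: final counter values
of finished instances are remembered and folded into every later
per-backend total:

  #include <cstdint>
  #include <map>
  #include <string>

  struct BackendTotalsSketch {
    // Final values of instances whose last update was already sent;
    // folded into every subsequent per-backend total.
    std::map<std::string, int64_t> finished_bytes_read;

    int64_t TotalBytesRead(
        const std::map<std::string, int64_t>& live_instances) const {
      int64_t total = 0;
      for (const auto& kv : finished_bytes_read) total += kv.second;
      for (const auto& kv : live_instances) total += kv.second;
      return total;
    }
  };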

The stats used in the exec summary were previously also plucked
out of the profile on the coordinator. This change moves the work
to the executor, and includes the per-node stats in the status
report.

I did a little cleanup of the profile counter declarations,
making sure they were consistently inside the impala namespace
in the files that I touched.

Testing:
Ran core tests.

Manually checked exec summary, query profiles and backends
page for a running query.

Change-Id: Ia2aca354d803ce78a798a1a64f9f98353b813e4a
Reviewed-on: http://gerrit.cloudera.org:8080/16050
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-12 04:01:37 +00:00
Thomas Tauber-Marshall
b67c0906f5 IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
The new admission control service will be written in protobuf, so
there are various admission control related structures currently
stored in Thrift that it would be convenient to convert to protobuf,
to minimize the amount of converting back and forth that needs to be
done.

This patch converts some portions of TExecPlanFragmentInfo to
protobuf. TExecPlanFragmentInfo is sent as a sidecar with the Exec()
rpc, so the refactored parts are now just directly included in the
ExecQueryFInstancesRequestPB.

The portions that are converted are those that are part of the
QuerySchedule, in particular the TPlanFragmentDestination,
TScanRangeParams, and TJoinBuildInput.

This patch is just a refactor and doesn't contain any functional
changes.

One notable related change is that DataSink::CreateSink() has two
parameters removed - TPlanFragmentCtx (which no longer exists) and
TPlanFragmentInstanceCtx. These variables and the new PB equivalents
are available via the RuntimeState that was already being passed in as
another parameter and don't need to be individually passed in.

Testing:
- Passed a full run of existing tests.
- Ran the single node perf test and didn't detect any regressions.

Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Reviewed-on: http://gerrit.cloudera.org:8080/15844
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-05-29 01:26:58 +00:00
Sahil Takiar
7f7743dcc6 IMPALA-9296: Move AuxErrorInfo to StatefulStatus
This patch moves AuxErrorInfoPB from FragmentInstanceExecStatusPB to
StatefulStatusPB. This is necessary because if the report with the
AuxErrorInfoPB is dropped (e.g. due to backpressure at the Coordinator
or a flaky network), the next report won't contain the AuxErrorInfoPB,
and the error info will be lost. StatefulStatus solves this by detecting
any reports that may not have been received by the Coordinator, and
re-transmitting any StatefulStatuses that were not successfully
delivered.

This change also makes the setting of AuxErrorInfoPB stateful, so that
the error info only shows up in one report and is then dropped from the
RuntimeState.
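
The retransmission idea, sketched with hypothetical names and an
assumed sequence-number scheme (the commit does not spell out the
mechanism):

  #include <cstdint>
  #include <vector>

  struct StatefulEntrySketch { int64_t seq; /* e.g. AuxErrorInfoPB */ };

  // Executor side: stateful entries are kept until the coordinator
  // acknowledges them, so an entry lost with a dropped report is
  // included again in the next one.
  std::vector<StatefulEntrySketch> ToResend(
      const std::vector<StatefulEntrySketch>& pending, int64_t acked_seq) {
    std::vector<StatefulEntrySketch> out;
    for (const StatefulEntrySketch& s : pending) {
      if (s.seq > acked_seq) out.push_back(s);
    }
    return out;
  }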

Change-Id: Iabbb48dd3ab58ba7b76b1ab6979b92d0e25e72e3
Reviewed-on: http://gerrit.cloudera.org:8080/15046
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-01-18 02:05:45 +00:00
Sahil Takiar
8a4fececcf IMPALA-9137: Blacklist node if a DataStreamService RPC to the node fails
Introduces a new optional field to FragmentInstanceExecStatusPB:
AuxErrorInfoPB. AuxErrorInfoPB contains optional metadata associated
with a failed fragment instance. Currently, AuxErrorInfoPB only contains
one field: RPCErrorInfoPB, which is only set if the fragment failed
because an RPC to another impalad failed. The RPCErrorInfoPB contains
the destination node of the failed RPC and the posix error code of the
failed RPC.

Coordinator::UpdateBackendExecStatus(ReportExecStatusRequestPB, ...)
uses the information in RPCErrorInfoPB (if one is set) to blacklist
the target node. While RPCErrorInfoPB::dest_node can be set to the address
of the Coordinator, the Coordinator will not blacklist itself. The
Coordinator only blacklists the node if the RPC failed with a specific
error code (currently either ENOTCONN, ECONNREFUSED, ESHUTDOWN).
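
The blacklisting decision, sketched with hypothetical names (the error
codes are the ones listed above):

  #include <cerrno>
  #include <string>

  // Blacklist the RPC target only for the hard connectivity errors
  // listed above, and never when the failed destination is the
  // coordinator itself.
  bool ShouldBlacklist(int posix_code, const std::string& dest_node,
                       const std::string& coord_node) {
    if (dest_node == coord_node) return false;
    return posix_code == ENOTCONN || posix_code == ECONNREFUSED ||
           posix_code == ESHUTDOWN;
  }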

Testing:
* Ran core tests
* Added new test to test_blacklist.py

Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c
Reviewed-on: http://gerrit.cloudera.org:8080/14677
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-12-20 02:50:46 +00:00
Thomas Tauber-Marshall
a1588e4498 IMPALA-9181: Serialize TQueryCtx once per query
When issuing Exec() rpcs to backends, we currently serialize the
TQueryCtx once per backend. This is inefficient as the TQueryCtx is
the same for all backends and really only needs to be serialized once.

Serializing the TQueryCtx can be expensive as it contains both the
full text of the original query and the descriptor table, which can be
quite large. In a synthetic dataset I tested with, scanning a table
with 100k partitions leads to a descriptor table size of ~20MB.

This patch serializes the TQueryCtx in the coordinator and then passes
it to each BackendState when calling Exec().
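
The shape of the change, sketched with hypothetical signatures (the
real code uses Impala's Thrift serialization):

  #include <string>
  #include <vector>

  struct TQueryCtx { /* query text, descriptor table, ... */ };
  struct BackendState { void Exec(const std::string& serialized_ctx); };

  std::string SerializeThrift(const TQueryCtx& ctx);  // hypothetical

  // Serialize once in the coordinator, then hand the same bytes to
  // every BackendState instead of re-serializing per backend.
  void StartBackends(const TQueryCtx& ctx,
                     std::vector<BackendState>* backends) {
    const std::string serialized = SerializeThrift(ctx);
    for (BackendState& be : *backends) be.Exec(serialized);
  }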

Followup work might consider if we really need all of the info in the
TQueryCtx to be distributed to all backends.

Testing:
- Passed full run of existing tests.
- Single node perf run showed no significant change.

Change-Id: I6a4dd302fd5602ec2775492a041ddd51e7d7a6c6
Reviewed-on: http://gerrit.cloudera.org:8080/14777
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-12-03 23:26:25 +00:00
Thomas Tauber-Marshall
30c3cd95a4 IMPALA-7467: Port ExecQueryFInstances to krpc
This patch ports the ExecQueryFInstances rpc to use KRPC. The
parameters for this call contain a huge number of Thrift structs
(eg. everything related to TPlanNode and TExpr), so to avoid
converting all of these to protobuf and the resulting effect that
would have on the FE and catalog, this patch stores most of the
parameters in a sidecar (in particular the TQueryCtx,
TPlanFragmentCtx's, and TPlanFragmentInstanceCtx's).
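
The sidecar idea, sketched generically as a data layout (not the exact
KRPC API):

  #include <string>
  #include <vector>

  // The stable, small part of the request is a protobuf message; the
  // bulky Thrift structures (TQueryCtx, TPlanFragmentCtx, ...) travel
  // as opaque serialized blobs attached to the call, so they never
  // need to be redefined in protobuf.
  struct ExecRequestSketch {
    std::string request_pb;              // serialized protobuf header
    std::vector<std::string> sidecars;   // serialized Thrift payloads
  };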

Testing:
- Passed a full exhaustive run on the minicluster.
Set up a ten node cluster with tpch 500:
- Ran perf tests: 3 iterations per tpch query, 4 concurrent streams,
  no perf change.
- Ran the stress test for 1000 queries, passed.

Change-Id: Id3f1c6099109bd8e5361188005a7d0e892147570
Reviewed-on: http://gerrit.cloudera.org:8080/13583
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-06-19 07:21:01 +00:00
Andrew Sherman
adde66b37c IMPALA-7985: Port RemoteShutdown() to KRPC.
The :shutdown command is used to shutdown a remote server. The common
case is that a user specifies the impalad to shutdown by specifying a
host e.g. :shutdown('host100'). If a user has more than one impalad on a
remote host then the form :shutdown('<host>:<port>') can be used to
specify the port by which the impalad can be contacted. Prior to
IMPALA-7985 this port was the backend port, e.g.
:shutdown('host100:22000'). With IMPALA-7985 the port to use is the KRPC
port, e.g. :shutdown('host100:27000').

Shutdown is implemented by making an rpc call to the target impalad.
This changes the implementation of this call to use KRPC.

To aid the user in finding the KRPC port, the KRPC address is added to
the /backends section of the debug web page.

We attempt to detect the case where :shutdown is pointed at a thrift
port (like the backend port) and print an informative message.

Documentation of this change will be done in IMPALA-8098.
Further improvements to DoRpcWithRetry() will be done in IMPALA-8143.

For discussion of why it was chosen to implement this change in an
incompatible way, see comments in
https://issues.apache.org/jira/browse/IMPALA-7985.

TESTING

Ran all end-to-end tests.
Enhanced the test for /backends in test_web_pages.py.
In test_restart_services.py added a call to the old backend port to the
test. Some expected error messages were changed in line with what KRPC
returns.

Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e
Reviewed-on: http://gerrit.cloudera.org:8080/12260
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-02-09 02:18:09 +00:00
Thomas Tauber-Marshall
b1e4957ba7 IMPALA-4555: Make QueryState's status reporting more robust
QueryState periodically collects runtime profiles from all of its
fragment instances and sends them to the coordinator. Previously, each
time this happened, if the rpc failed, QueryState would retry twice
after a configurable timeout and then cancel the fragment instances
under the assumption that the coordinator no longer exists.

We've found in real clusters that this logic is too sensitive to
failed rpcs and can result in fragment instances being cancelled even
in cases where the coordinator is still running.

This patch makes a few improvements to this logic:
- When a report fails to send, instead of retrying the same report
  quickly (after waiting report_status_retry_interval_ms), we wait the
  regular reporting interval (status_report_interval_ms), regenerate
  any stale portions of the report, and then retry.
- A new flag, --status_report_max_retries, is introduced, which
  controls the number of failed reports that are allowed before the
  query is cancelled. --report_status_retry_interval_ms is removed.
- Backoff is used for repeated failed attempts, such that for a period
  between retries of 't', on try 'n' the actual timeout will be t * n.
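
A tiny sketch of the backoff rule from the last point (hypothetical
helper):

  #include <cstdint>

  // For a base reporting interval t, the wait before retry n is t * n.
  int64_t RetryBackoffMs(int64_t status_report_interval_ms, int64_t try_num) {
    return status_report_interval_ms * try_num;
  }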

Testing:
- Added a test which results in a large number of failed intermediate
  status reports but still succeeds.

Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971
Reviewed-on: http://gerrit.cloudera.org:8080/12049
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-02-07 22:21:51 +00:00
Andrew Sherman
e4cff7d0d6 IMPALA-7468: Port CancelQueryFInstances() to KRPC.
When the Coordinator needs to cancel a query (for example because a user
has hit Control-C), it does this by sending a CancelQueryFInstances
message to each fragment instance. This change switches this code to use
KRPC.

Add new protobuf definitions for the messages, and remove the old thrift
definitions. Move the server-side implementation of Cancel() from
ImpalaInternalService to ControlService. Rework the scheduler so
that the FInstanceExecParams always contains the KRPC address of the
fragment executors; this address can then be used if a query is to be
cancelled.

For now keep the KRPC calls to CancelQueryFInstances() as synchronous.

While moving the client-side code, remove the fault injection code that
was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and
FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with
--fault_injection_rpc_exception_type=1) as this tickles code in
client-cache.h which is now not used.

TESTING:
  Ran all end-to-end tests.
  No new tests as test_cancellation.py provides good coverage.
  Checked in debugger that DebugAction style fault injection (triggered
  from test_cancellation.py) was working correctly.

Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
Reviewed-on: http://gerrit.cloudera.org:8080/12142
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-01-08 01:05:54 +00:00
Michael Ho
941038229a IMPALA-4063: Merge report of query fragment instances per executor
Previously, each fragment instance executing on an executor would
independently report its status to the coordinator periodically.
This creates a huge amount of RPCs to the coordinator under highly
concurrent workloads, causing lock contention in the coordinator's
backend states when multiple fragment instances send reports at the
same time. In addition, due to the lack of coordination between query
fragment instances, a query may end without collecting the profiles
from all fragment instances when one of them hits an error before
another fragment instance manages to finish Prepare(), leading to
missing profiles for certain fragment instances.

This change fixes the problem above by making a thread per QueryState
(started by QueryExecMgr) to be responsible for periodically reporting
the status and profiles of all fragment instances of a query running
on a backend. As part of this refactoring, each query fragment instance
no longer reports its errors individually. Instead, a cumulative status
is maintained per QueryState. It's set to the error status of the first
fragment instance which hits an error, or to any general error (e.g.
failure to start a thread) encountered when starting fragment
instances. With this change, the per-instance status reporting threads
are also removed.
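
The per-QueryState reporting loop, sketched with made-up names:

  #include <atomic>
  #include <chrono>
  #include <thread>

  void SendMergedStatusReport();  // hypothetical: one report, all instances

  // One thread per QueryState replaces one report stream per fragment
  // instance: it periodically sends a single report covering every
  // instance of the query running on this backend.
  void ReportLoop(std::atomic<bool>* query_done, int interval_ms) {
    while (!query_done->load()) {
      SendMergedStatusReport();
      std::this_thread::sleep_for(std::chrono::milliseconds(interval_ms));
    }
    SendMergedStatusReport();  // final report once the query finishes
  }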

Testing done: exhaustive tests

This patch is based on a patch by Sailesh Mukil

Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Reviewed-on: http://gerrit.cloudera.org:8080/11615
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-11-06 01:01:07 +00:00
Michael Ho
5391100c7e IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
This change converts the ReportExecStatus() RPC from Thrift-based
RPC to KRPC. This is done as part of the preparation
for fixing IMPALA-2990, as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections, reducing the number of TCP connections
to one for each executor.

This patch also introduces a new service pool that will host all query
execution control related RPCs in the future, so that control commands
from coordinators aren't blocked by long-running DataStream services'
RPCs. To avoid unnecessary delays due to sharing network connections
between the DataStream service and the Control service, this change
adds the service name as part of the user credentials for the
ConnectionId, so each service will use a separate connection.

The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

This patch also fixes IMPALA-7241, a bug which may lead to duplicated
dml stats being applied. The fix adds a monotonically increasing
version number to fragment instances' reports. The coordinator ignores
any report whose version is smaller than or equal to the version in the
last report.
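
The versioning fix on the coordinator side, sketched (hypothetical
helper):

  #include <cstdint>

  // Coordinator side: each fragment instance's reports carry a
  // monotonically increasing version; stale or duplicated reports are
  // dropped so dml stats are never applied twice.
  bool ShouldApplyReport(int64_t report_version, int64_t* last_applied) {
    if (report_version <= *last_applied) return false;
    *last_applied = report_version;
    return true;
  }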

Testing done:
1. Exhaustive build.
2. Added some targeted test cases for profile serialization failure
   and RPC retries/timeout.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Reviewed-on: http://gerrit.cloudera.org:8080/10855
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-11-01 21:12:12 +00:00