impala

jprdonnelly/impala

mirror of https://github.com/apache/impala.git synced 2025-12-20 10:29:58 -05:00

Commit Graph

Author

SHA1

Message

Date

Andrew Sherman

ea13e74497

IMPALA-10309: Use sleep time from a Retry-After header in Impala Shell

When Impala Shell receives an http error message (that is a message with
http code greater than or equal to 300), it may sleep for a time before
retrying. If the message contains a 'Retry-After' header that has an
integer value, then this will be used as the time for which to sleep.

The implementation is to use a new HttpError exception (similar to that
used in Impyla) which includes more information from the error message
(including the headers) so that catchers of the exception can use the
'Retry-After' header if appropriate.
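
As a rough illustration of the idea, and not the code from this commit, the sketch below shows an exception that carries the response headers and a helper that uses an integer 'Retry-After' value as the sleep time. The constructor arguments of HttpError and the helper name sleep_time_from are assumptions made for this example.

```python
class HttpError(Exception):
    """Hypothetical stand-in for the HttpError exception described above."""

    def __init__(self, code, message, headers=None):
        super().__init__("HTTP code {}: {}".format(code, message))
        self.code = code
        self.message = message
        # Keep the headers so that whoever catches the exception can inspect them.
        self.headers = dict(headers or {})


def sleep_time_from(error, default_sleep):
    """Return the Retry-After value if it is a non-negative integer,
    otherwise fall back to default_sleep (hypothetical helper)."""
    for name, value in error.headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            try:
                seconds = int(str(value).strip())
            except ValueError:
                # For example an HTTP-date; only integer values are used.
                return default_sleep
            return seconds if seconds >= 0 else default_sleep
    return default_sleep
```

A caller would catch HttpError around the failed request and pass the result of sleep_time_from(error, default_sleep) to time.sleep() before retrying.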

TESTING:
    Hand testing with a proxy that uses the 'Retry-After' header.
    Added new tests that use the fault injection framework in
    test_hs2_fault_injection.py

Change-Id: I2b4226e7723d585d61deb4d1d6777aac901bfd93
Reviewed-on: http://gerrit.cloudera.org:8080/16702
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

2020-11-11 07:08:12 +00:00

Sahil Takiar

3088ca8580

IMPALA-9818: Add fetch size as option to impala shell

Adds the option --fetch_size to the Impala shell. This new option allows
users to specify the fetch size used when issuing fetch RPCs to the
Impala Coordinator (e.g. TFetchResultsReq and BeeswaxService.fetch).
This parameter applies for all client protocols: beeswax, hs2, hs2-http.
The default --fetch_size is set to 10240 (10x the default batch size).

The new --fetch_size parameter is most effective when result spooling is
enabled. When result spooling is disabled, Impala can only return a
single row batch per fetch RPC (so 1024 rows by default). When result
spooling is enabled, Impala can return up to 100 row batches per fetch
request.
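
The arithmetic in the paragraph above can be sketched as follows. This is a simplified model based only on the numbers quoted in this message (1024 rows per batch, one batch per fetch without result spooling, up to 100 batches with it); the function name and the assumption that the row count is simply capped by fetch_size are illustrative, not taken from the Impala source.

```python
DEFAULT_BATCH_SIZE = 1024          # rows per row batch, as quoted above
MAX_BATCHES_WITH_SPOOLING = 100    # "up to 100 row batches per fetch request"


def max_rows_per_fetch(fetch_size, spooling_enabled,
                       batch_size=DEFAULT_BATCH_SIZE):
    """Upper bound on rows returned by one fetch RPC under this simple model."""
    batches = MAX_BATCHES_WITH_SPOOLING if spooling_enabled else 1
    return min(fetch_size, batches * batch_size)


# Default --fetch_size of 10240 with and without result spooling.
print(max_rows_per_fetch(10240, spooling_enabled=False))
print(max_rows_per_fetch(10240, spooling_enabled=True))
```

Under this model the default fetch size only takes full effect when result spooling is enabled, which matches the statement that the option is most effective in that mode.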

Removes some logic in the impala_client.py file that attempts to
simulate a fetch_size. The code would issue multiple fetch requests to
fulfill the given fetch_size. This logic is no longer needed now that
result spooling is available.

Testing:
* Ran core tests
* Added new tests in test_shell_client.py and test_shell_commandline.py

Change-Id: I8dc7962aada6b38795241d067a99bd94fabca57b
Reviewed-on: http://gerrit.cloudera.org:8080/16041
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Sahil Takiar <stakiar@cloudera.com>

2020-06-10 17:46:21 +00:00

Abhishek Rawat

fc784f6e95

IMPALA-9466: impala-shell client retry for hs2-http protocol

Added retries for idempotent rpcs:
OpenSession, PingImpalaHS2Service, GetResultSetMetadata,
CloseImpalaOperation (non dmls), CancelOperation, GetOperationStatus,
GetRuntimeProfile, GetExecSummary, GetLog

Retries were also added to the 'set all' query execution and subsequent
result fetch in the ImpalaHS2Client._open_session()

The retries are only supported for hs2-http protocol and enabled by
default. At most there are 3 retries for a failed rpc. There is a sleep
duration of 'n' seconds after nth retry.

Only rpcs that fail due to an error in the http transport are retried.
An rpc that fails because the server returned an error in the rpc
response is not retried.
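
The retry policy described above might look roughly like the sketch below. It is not the impala-shell implementation: TransportError and call_with_retries are hypothetical names, and sleeping n seconds before the n-th retry is one reading of "a sleep duration of 'n' seconds after nth retry".

```python
import time


class TransportError(Exception):
    """Hypothetical marker for a failure in the http transport."""


def call_with_retries(rpc, max_retries=3, sleep=time.sleep):
    """Call rpc(), retrying up to max_retries times on transport errors.

    Any other exception (for example an error reported in the rpc
    response) propagates immediately and is not retried.
    """
    retries = 0
    while True:
        try:
            return rpc()
        except TransportError:
            if retries == max_retries:
                raise  # retries exhausted, surface the last transport error
            retries += 1
            sleep(retries)  # 1 second, then 2, then 3
```

The sleep function is passed in as a parameter so that the policy can be exercised without actually waiting. Only idempotent rpcs, such as those listed at the top of this message, should be wrapped this way.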

Improved error diagnostics by dumping stack trace when
ImpalaShell._execute_stmt() gets an 'Unknown Exception'.

Testing:
- Added a custom_cluster test which injects fault into the http transport
and checks expected behavior from the various rpcs. Some of these tests
leave the session in an open state and so these tests are not suitable
for the e2e test framework which has metric verifiers expecting related
metrics to be 0 at the end of the test.
- Manually tested real world scenarios with impala-shell client
communicating with an impala coordinator via a fault injecting istio mesh.
- Manually tested dropping connections on an nginx ingress gateway by sending
SIGTERM to all worker processes.

Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Reviewed-on: http://gerrit.cloudera.org:8080/15378
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

2020-03-28 00:32:18 +00:00

3 Commits