Also fixes: IMPALA-2400, IMPALA-3043
This change fixes the scheduling of scan ranges on remote hosts by
adding remote backend selection to SimpleScheduler. Prior to this
change, the scheduler would try to select a local backend even when
remote scheduling was requested.
This change also allows pseudo-randomized remote backend selection to
prevent convoying, which could happen when independent schedulers had
the same internal state, e.g. after a cluster restart. To enable the
new behavior, set the query option SCHEDULE_RANDOM_REPLICA to true.
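The convoying problem can be illustrated with a minimal sketch. This is
not the SimpleScheduler code from this change: the class
RemoteBackendPicker, its methods, and the per-scheduler seed are all
hypothetical names chosen for illustration. It only contrasts a
deterministic round-robin pick, which behaves identically in every
scheduler that starts from the same state, with a seeded pseudo-random
pick.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one scheduler's view of the remote backends.
// This is an illustration only, not Impala's SimpleScheduler.
class RemoteBackendPicker {
 public:
  // 'seed' is assumed to differ between schedulers; that difference is
  // what keeps independent schedulers from picking in lockstep.
  RemoteBackendPicker(std::vector<std::string> backends, std::uint64_t seed)
      : backends_(std::move(backends)), rng_(seed) {
    assert(!backends_.empty());
  }

  // Deterministic round-robin. Two schedulers with the same internal
  // state (e.g. right after a cluster restart) return the same sequence,
  // so they keep sending work to the same backend at the same time.
  const std::string& NextRoundRobin() {
    const std::string& backend = backends_[next_ % backends_.size()];
    ++next_;
    return backend;
  }

  // Pseudo-random pick, uniform over all backends. Schedulers seeded
  // differently are unlikely to follow the same sequence.
  const std::string& NextRandom() {
    std::uniform_int_distribution<std::size_t> dist(0, backends_.size() - 1);
    return backends_[dist(rng_)];
  }

 private:
  std::vector<std::string> backends_;
  std::mt19937_64 rng_;
  std::size_t next_ = 0;
};
```

In this sketch, two pickers built over the same backend list return the
same backend from every NextRoundRobin() call, while NextRandom()
depends on each picker's seed. In Impala itself the behavior is
controlled per query, e.g. by running SET SCHEDULE_RANDOM_REPLICA=true;
in impala-shell before the query.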
This change also fixes IMPALA-2400: Unpredictable locality behavior
for reading Parquet files.
This change also fixes IMPALA-3043: SimpleScheduler does not handle
hosts with multiple IP addresses correctly.
This change also does some clean-up in scheduler.h and
simple-scheduler.{h,cc}.
Change-Id: I044f83806fcde820fcb38047cf6b8e780d803858
Reviewed-on: http://gerrit.cloudera.org:8080/3771
Reviewed-by: Lars Volker <lv@cloudera.com>
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins