mirror of
https://github.com/apache/impala.git
synced 2026-01-21 15:03:35 -05:00
Currently, the sender uses the same hash seed for every partitioning exchange. For a table whose shuffling keys have a skewed distribution, multiple queries that shuffle on the same keys all hash the skewed rows to the same destination fragments on a particular host, potentially overloading that host.

This patch seeds the hash with the query id, so that partitioning exchanges with the same shuffling keys do not always hash to the same destination.

Testing:
Added a test to data-stream-test to verify that the data values at the destination differ for different queries.

Change-Id: I1936e6cc3e8d66420a5a9301f49221ca38f3e468
Reviewed-on: http://gerrit.cloudera.org:8080/15497
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
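
The idea of seeding the partitioning hash with the query id can be sketched as follows. This is a minimal illustration and not the code from the patch: `QueryId`, `SeedFromQueryId`, `HashKey`, and `ChannelFor` are hypothetical names, and seeded FNV-1a is used only as a stand-in for whatever hash function the sender actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a 128-bit query id (an assumption for this sketch).
struct QueryId {
  uint64_t hi;
  uint64_t lo;
};

// Fold the query id into a 64-bit seed. The mixing is illustrative only:
// the FNV-1a offset basis is XORed with both halves of the id.
inline uint64_t SeedFromQueryId(const QueryId& id) {
  constexpr uint64_t kBaseSeed = 14695981039346656037ULL;  // FNV-1a offset basis
  return kBaseSeed ^ id.hi ^ id.lo;
}

// Seeded FNV-1a over the bytes of the shuffling key.
inline uint64_t HashKey(const std::string& key, uint64_t seed) {
  constexpr uint64_t kPrime = 1099511628211ULL;  // 64-bit FNV prime
  uint64_t h = seed;
  for (unsigned char c : key) {
    h ^= c;
    h *= kPrime;
  }
  return h;
}

// Pick the destination channel for a row. Within one query the mapping is
// stable; across queries the seed differs, so the mapping can differ too.
inline size_t ChannelFor(const std::string& key, const QueryId& id,
                         size_t num_channels) {
  assert(num_channels > 0);
  return static_cast<size_t>(HashKey(key, SeedFromQueryId(id)) % num_channels);
}
```

Within a single query every sender derives the same seed, so rows with equal keys still meet at the same destination. Only the choice of destination for a given key varies from query to query, which is what keeps a skewed key from always landing on the same host.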