mirror of
https://github.com/apache/impala.git
synced 2025-12-26 14:02:53 -05:00
Impala supports reading Parquet files with multiple row groups, but doing so can degrade performance because of remote reads. This patch maximizes scan locality by allowing multiple impalads to scan the row groups in their local splits. Each impalad starts a new scan range for each split local to it if that split contains row group(s) that need to be scanned.

Change-Id: Iaecc5fb8e89364780bc59dbfa9ae51d0d124d16e
Reviewed-on: http://gerrit.cloudera.org:8080/908
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
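
The locality rule described above can be illustrated with a small standalone sketch. It is not taken from the patch: the `Split`, `RowGroup`, and `LocalScan` types and both function names are hypothetical, and the rule that a row group belongs to the split containing its midpoint is an assumption made here so that each row group is assigned to exactly one split.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; these are not Impala's classes.
struct Split {
  int64_t offset;   // byte offset of the split within the file
  int64_t length;   // length of the split in bytes
  bool is_local;    // true if the split's data is local to this node
};

struct RowGroup {
  int64_t offset;       // byte offset where the row group starts
  int64_t total_bytes;  // size of the row group in bytes
};

struct LocalScan {
  std::size_t split_index;              // which split the scan range covers
  std::vector<std::size_t> row_groups;  // row groups to scan in that range
};

// Assumption: a row group belongs to the split that contains its midpoint,
// so every row group is claimed by exactly one split.
std::vector<std::size_t> RowGroupsForSplit(
    const Split& split, const std::vector<RowGroup>& row_groups) {
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < row_groups.size(); ++i) {
    const int64_t mid = row_groups[i].offset + row_groups[i].total_bytes / 2;
    if (mid >= split.offset && mid < split.offset + split.length) {
      result.push_back(i);
    }
  }
  return result;
}

// Plans one scan per local split, skipping splits with no row groups to scan.
std::vector<LocalScan> PlanLocalScans(
    const std::vector<Split>& splits, const std::vector<RowGroup>& row_groups) {
  std::vector<LocalScan> plan;
  for (std::size_t s = 0; s < splits.size(); ++s) {
    if (!splits[s].is_local) continue;  // remote splits are left to other nodes
    std::vector<std::size_t> groups = RowGroupsForSplit(splits[s], row_groups);
    if (groups.empty()) continue;       // nothing to scan in this split
    plan.push_back(LocalScan{s, groups});
  }
  return plan;
}
```

Calling `PlanLocalScans` on each node with that node's view of `is_local` would give every row group to at most one node under the midpoint assumption, which is the property the commit message relies on to avoid remote reads.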
2.6 MiB
Executable File