mirror of
https://github.com/apache/impala.git
synced 2026-01-07 18:02:33 -05:00
This introduces KuduScanNodeMt, the single-threaded version of KuduScanNode that materializes the tuples in GetNext(). KuduScanNodeMt is enabled by the same condition as HdfsScanNodeMt: mt_dop is greater than or equal to 1. To share code between the two implementations, KuduScanNode and KuduScanNodeMt are now subclasses of KuduScanNodeBase, which implements the shared code. The KuduScanner is minimally impacted, as it already had the required GetNext interface. Since the KuduClient is a heavy-weight object, it is now shared at the QueryState level. We try to share the KuduClient as much as possible, but there are times when the KuduClient cannot be shared. Each Kudu table has master addresses stored in the Hive Metastore. We only share KuduClients for tables that have an identical value for the master addresses. In the ideal case, every Kudu table will have the same value, but there is no explicit guarantee of this. The testing for this is a modified version of kudu-scan-node.test run with various mt_dop values. Change-Id: I6e4593300e376bc508b78acaea64ffdd2c73a67a Reviewed-on: http://gerrit.cloudera.org:8080/6312 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: Impala Public Jenkins
This directory contains Impala test workloads. The directory layout for the workloads should follow: workloads/ <data set name>/<data set name>_dimensions.csv <- The test dimension file <data set name>/<data set name>_core.csv <- A test vector file <data set name>/<data set name>_pairwise.csv <data set name>/<data set name>_exhaustive.csv <data set name>/queries/<query test>.test <- The queries for this workload