Mirror of https://github.com/apache/impala.git
Synced 2025-12-23 21:08:39 -05:00
IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
There is a bug in Hive 1.1.0 that can result in a NullPointerException when doing parallel Hive operations (see IMPALA-6532). Since dataload goes parallel on Hive loads starting with IMPALA-6372, dataload can hit this error on Hive 1.1.0 (i.e. IMPALA_MINICLUSTER_PROFILE=2). This is impacting builds on the 2.x branch.

This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2. IMPALA_MINICLUSTER_PROFILE=3 uses a newer version of Hive that has a fix for this, so parallel dataload stays enabled in that case. Parallelism can be reenabled when Hive 1.1.0 gets the fix from Hive 2.1.1.

Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Reviewed-on: http://gerrit.cloudera.org:8080/10306
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
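
The gating described in the message can be sketched as follows. This is only an illustration under stated assumptions: the helper name `dataload_num_processes` is hypothetical and does not appear in the commit, and the actual change may implement the switch differently (for example in a shell wrapper that passes `--num_processes`).

```python
import multiprocessing
import os


def dataload_num_processes(environ=os.environ):
    """Hypothetical helper: choose the dataload parallelism for a profile.

    Assumes IMPALA_MINICLUSTER_PROFILE is exposed as an environment
    variable, as the commit message suggests.
    """
    # Profile 2 runs Hive 1.1.0, which can hit the NullPointerException
    # under parallel Hive operations, so fall back to one process.
    if environ.get("IMPALA_MINICLUSTER_PROFILE") == "2":
        return 1
    # Other profiles (e.g. 3) use a Hive with the fix; keep parallel loads.
    return multiprocessing.cpu_count()
```

The result would then be handed to the loader through its existing `--num_processes` option.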
committed by Impala Public Jenkins
parent c35ec6c9bd
commit b126b2d105
@@ -78,7 +78,7 @@ parser.add_option("--use_kerberos", action="store_true", default=False,
                   help="Load data on a kerberized cluster.")
 parser.add_option("--principal", default=None, dest="principal",
                   help="Kerberos service principal, required if --use_kerberos is set")
-parser.add_option("--num_processes", default=multiprocessing.cpu_count(),
+parser.add_option("--num_processes", type="int", default=multiprocessing.cpu_count(),
                   dest="num_processes", help="Number of parallel processes to use.")
 
 options, args = parser.parse_args()
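
Why the hunk adds `type="int"` can be illustrated with a small standalone sketch. It assumes the script uses the standard-library `optparse` module, which the `parser.add_option` / `parser.parse_args()` calls suggest. The `build_parser` helper is hypothetical and exists only to compare the two variants side by side.

```python
import multiprocessing
from optparse import OptionParser


def build_parser(typed):
    """Hypothetical helper: build the option with or without type="int"."""
    parser = OptionParser()
    extra = {"type": "int"} if typed else {}
    parser.add_option("--num_processes", default=multiprocessing.cpu_count(),
                      dest="num_processes",
                      help="Number of parallel processes to use.", **extra)
    return parser


# Without a type, optparse stores the command-line value as a string,
# while the default stays an int. The option's type then depends on
# whether the flag was passed.
untyped, _ = build_parser(typed=False).parse_args(["--num_processes=1"])

# With type="int", the command-line value is converted to an int.
typed, _ = build_parser(typed=True).parse_args(["--num_processes=1"])

print(type(untyped.num_processes).__name__, type(typed.num_processes).__name__)
```

With the typed variant, a caller can pass `--num_processes=1` and compare or do arithmetic on the value without converting it first.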