impala/tests/metadata
stiga-huang 030c12ab2c IMPALA-9492: Fix test_unescaped_string_partition failing on S3
test_unescaped_string_partition in metadata/test_recover_partitions.py
uses HDFS clients to create four partition directories whose names
contain special characters, i.e. single quotes, double quotes and
backslashes. It tests whether ALTER TABLE RECOVER PARTITIONS recognizes
those directories correctly. However, when running against S3, only two
distinct directories are created instead of the expected four, which
causes the failure.

The reason is that when running against S3, we use the hadoop CLI for
filesystem operations, and a shell command is launched for each
operation. Passing arguments through the shell unescapes them a second
time, so the 4 dirs [p=', p=", p=\', p=\"] end up as
[p=', p=", p=', p="], i.e. only two distinct directories. When the test
runs against HDFS, we use webhdfs_client instead, so we don't have this
issue.
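
The collapse from four names to two can be illustrated with a minimal
sketch. It assumes a simplified model in which the extra shell pass only
drops each backslash and keeps the character after it; the helper
strip_one_escape_level is hypothetical and is not part of the Impala test
code or the hadoop CLI.

```python
import re


def strip_one_escape_level(arg):
    # Simplified model of one extra unescaping pass (an assumption, not the
    # real shell grammar): drop each backslash, keep the escaped character.
    return re.sub(r"\\(.)", r"\1", arg)


# The four directory names the test intends to create.
intended = ["p='", 'p="', "p=\\'", 'p=\\"']

# What arrives after the extra unescaping pass.
after_shell = [strip_one_escape_level(d) for d in intended]
distinct = sorted(set(after_shell))

print(len(intended), "intended ->", len(distinct), "distinct:", distinct)
```

Under this model the backslash variants become indistinguishable from the
plain quote variants, which matches the two directories observed on S3.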

Actually, we shouldn't use raw special characters in partition paths.
Hive converts them to their ASCII hex values when creating partition
directories. E.g. the partition paths for [p=', p=", p=\', p=\"] are
[p=%27, p=%22, p=%5C%27, p=%5C%22]. We should follow this rule when
creating directories in the test. This also avoids the above shell issue
on S3.
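
As a rough sketch of that rule, the mapping from partition values to
directory names could look like the following. The helper
escape_partition_value and the SPECIAL_CHARS set are hypothetical and cover
only the three characters mentioned above; Hive's actual escape list is
longer, and this is not the code used in the test.

```python
# Characters named in this commit message; Hive escapes more than these.
SPECIAL_CHARS = {"'", '"', "\\"}


def escape_partition_value(value):
    # Replace each special character with '%' plus its two-digit uppercase
    # ASCII hex code, leaving every other character unchanged.
    return "".join(
        "%{:02X}".format(ord(c)) if c in SPECIAL_CHARS else c for c in value
    )


# Partition values for the four directories discussed above.
values = ["'", '"', "\\'", '\\"']
partition_dirs = ["p=" + escape_partition_value(v) for v in values]

print(partition_dirs)
```

Because the resulting names contain only '%', letters and digits after the
"p=" prefix, they pass through a shell unchanged.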

Tests:
 - Added two more special partitions in test_unescaped_string_partition.
 - Ran test_unescaped_string_partition on S3.

Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb
Reviewed-on: http://gerrit.cloudera.org:8080/15475
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-03-20 05:21:53 +00:00