IMPALA-7779 Parquet Scanner can write binary data into profile

Previously, an ill-formed Parquet version string was written verbatim
into error messages and impalad.INFO. With this fix, any such string
is first converted to a hex string: four groups of one or two hex
digits each, separated by spaces, such as "6c 65 2e a".
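
A minimal sketch of the conversion described above, assuming each byte
is printed as lowercase hex without zero padding (which is what the
one-digit group "a" in the example suggests). BytesToHexGroups is a
hypothetical name used for illustration; it is not the helper that
this change adds to Impala.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper (not Impala's actual function): renders each byte
// as one or two lowercase hex digits, with groups separated by spaces.
std::string BytesToHexGroups(const uint8_t* data, size_t len) {
  std::ostringstream out;
  out << std::hex;  // Sticky: applies to every integer written afterwards.
  for (size_t i = 0; i < len; ++i) {
    if (i > 0) out << ' ';
    // Cast so the byte is formatted as a number rather than a character.
    out << static_cast<unsigned int>(data[i]);
  }
  return out.str();
}
```

Under these assumptions, the four bytes "XXXX" become "58 58 58 58",
which matches the updated expected error message in the test file.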

Testing:
 Ran "core" tests successfully.

Change-Id: I281d6fa7cb2f88f04588110943e3e768678b9cf1
Reviewed-on: http://gerrit.cloudera.org:8080/16331
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Sahil Takiar <stakiar@cloudera.com>
Qifan Chen
2020-08-12 16:33:51 -04:00
committed by Sahil Takiar
parent 6390e7e1da
commit 2ebf554dfd
3 changed files with 4 additions and 3 deletions

@@ -50,7 +50,7 @@ bigint,bigint,string,string,boolean,boolean,bigint,bigint,bigint,bigint
# Parquet file with invalid magic number
SELECT * from bad_magic_number
---- CATCH
-File '$NAMENODE/test-warehouse/bad_magic_number_parquet/bad_magic_number.parquet' has an invalid Parquet version number: XXXX
+File '$NAMENODE/test-warehouse/bad_magic_number_parquet/bad_magic_number.parquet' has an invalid Parquet version number: 58 58 58 58
====
---- QUERY
# count(*) query on parquet file with multiple blocks (one block per node)