impala

mirror of https://github.com/apache/impala.git synced 2026-01-05 12:01:11 -05:00

Skye Wanderman-Milne 8e44347831 IMPALA-1149: read bytes fields as strings in HdfsAvroScanner::MaterializeTuple()

Hive converts "bytes"-type fields to an array<tinyint> column, which
we can't even load the metadata for. However, if a bytes field appears
in a file schema but not the table schema, this change allows us to
read (but not materialize) the field. Otherwise we can't read the file
at all.

This change also adds a "bytes"-type field to one of the files in
functional_avro_snap.schema_resolution_test.

Change-Id: I25953ee049e174fc4dbff5d68520a6f87e545339
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3823
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 0e2e7c1ac0f63623b7ec3724920e9927cd782508)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3895

2014-08-18 20:17:05 -07:00

File               Last commit                                                                        Date
create_table.sql   IMPALA-867: Fail COMPUTE STATS in analysis for Avro tables affected by HIVE-6308.  2014-03-14 23:24:55 -07:00
file_schema1.avsc  IMPALA-1149: read bytes fields as strings in HdfsAvroScanner::MaterializeTuple()   2014-08-18 20:17:05 -07:00
file_schema2.avsc  IMPALA-441: support default values for Avro tables                                 2014-01-08 10:51:39 -08:00
README             IMPALA-441: support default values for Avro tables                                 2014-01-08 10:51:39 -08:00
records1.avro      IMPALA-1149: read bytes fields as strings in HdfsAvroScanner::MaterializeTuple()   2014-08-18 20:17:05 -07:00
records1.json      IMPALA-1149: read bytes fields as strings in HdfsAvroScanner::MaterializeTuple()   2014-08-18 20:17:05 -07:00
records2.avro      IMPALA-441: support default values for Avro tables                                 2014-01-08 10:51:39 -08:00
records2.json      IMPALA-441: support default values for Avro tables                                 2014-01-08 10:51:39 -08:00

README

This folder contains the files needed to test Impala's support for Avro schema
resolution. They are used together with the TestAvroSchemaResolution query test.

create_table.sql creates the functional_avro_snap.schema_resolution_test table and loads
records1.avro and records2.avro into it. The .avro files were generated from the
corresponding .json files with the following commands:

java -jar ~/avro-tools-1.7.4.jar fromjson --schema-file file_schema1.avsc --codec snappy records1.json > records1.avro
java -jar ~/avro-tools-1.7.4.jar fromjson --schema-file file_schema2.avsc --codec snappy records2.json > records2.avro

create_table.sql contains the table schema definition; file_schema1.avsc and
file_schema2.avsc contain the schemas that the two .avro files were written with.
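
For readers unfamiliar with Avro schema resolution, the following is a minimal
sketch of how record fields are matched by name between the schema a file was
written with and the schema a reader (here, the table) expects. The two schemas
below are hypothetical and are NOT the contents of file_schema1.avsc or
create_table.sql, and resolve_fields is an illustrative helper, not Impala code.
The sketch deliberately ignores aliases and type promotion.

```python
import json

# Hypothetical schemas for illustration only; they do not reproduce the
# schemas stored in this folder.
WRITER_SCHEMA = json.loads("""
{"type": "record", "name": "example", "fields": [
  {"name": "id", "type": "int"},
  {"name": "payload", "type": "bytes"}
]}
""")

READER_SCHEMA = json.loads("""
{"type": "record", "name": "example", "fields": [
  {"name": "id", "type": "int"},
  {"name": "label", "type": "string", "default": "none"}
]}
""")


def resolve_fields(writer, reader):
    """Classify record fields by name, following Avro's record resolution rules."""
    writer_names = {f["name"] for f in writer["fields"]}
    reader_names = {f["name"] for f in reader["fields"]}
    plan = {"materialize": [], "skip": [], "use_default": {}, "errors": []}

    # Fields are stored in writer order, so every writer field must be read.
    # Fields the reader does not know about are read and then discarded.
    for field in writer["fields"]:
        key = "materialize" if field["name"] in reader_names else "skip"
        plan[key].append(field["name"])

    # Reader fields missing from the file take their default value;
    # a missing field without a default is a resolution error.
    for field in reader["fields"]:
        if field["name"] in writer_names:
            continue
        if "default" in field:
            plan["use_default"][field["name"]] = field["default"]
        else:
            plan["errors"].append(field["name"])
    return plan


plan = resolve_fields(WRITER_SCHEMA, READER_SCHEMA)
print(plan)
```

In this sketch the "payload" field lands in the skip list: it must still be
decoded to advance through the file, but no value is produced for it. This is
the same situation as a bytes field that appears in a file schema but not in
the table schema.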