Commit Graph

5 Commits

Author SHA1 Message Date
Bharath Vissapragada
bb63339377 IMPALA-3314: Fix Avro schema loading for partitioned tables.
Bug: Commit 6f31c7 fixed a crash when setting Avro schemas for
tables with storage altered to Avro file format. However, the
fix was incomplete for partitioned/multi-file-format tables since
'hasAvroData_' is not set for all code paths that load the
partitioned tables (for example, HdfsTable#loadAllPartitions()).

Fix: Moved the code for setting 'hasAvroData_' to addPartition()
which is the common logic for all code paths adding new partitions.
Also fixed the test coverage gap by adding a new test for partitioned
tables altered to Avro format.
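
The idea behind the fix can be illustrated with a small sketch. This is
not the Impala code: the class, method, and field names below are
hypothetical Python stand-ins for HdfsTable, addPartition() and
'hasAvroData_', and the second load path is an assumption added only to
show that every path shares the same logic.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    name: str
    file_format: str  # e.g. "AVRO", "PARQUET", "TEXT" (illustrative values)


@dataclass
class Table:
    partitions: list = field(default_factory=list)
    has_avro_data: bool = False  # stand-in for 'hasAvroData_'

    def add_partition(self, partition):
        # Single choke point: every load path adds partitions through
        # this method, so no path can forget to set the flag.
        if partition.file_format == "AVRO":
            self.has_avro_data = True
        self.partitions.append(partition)

    def load_all_partitions(self, partitions):
        # Full reload: reset state, then add every partition.
        self.partitions.clear()
        self.has_avro_data = False
        for p in partitions:
            self.add_partition(p)

    def load_incremental(self, new_partitions):
        # Hypothetical second load path; it reuses add_partition() too.
        for p in new_partitions:
            self.add_partition(p)
```

Because the flag is set where partitions are added rather than in each
loader, a newly introduced load path picks up the behavior for free.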

Change-Id: I7854ff002b2277ec4a5388216218a1d5ad142de8
Reviewed-on: http://gerrit.cloudera.org:8080/5388
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2016-12-07 09:45:11 +00:00
Thomas Tauber-Marshall
e6e2baea33 IMPALA-4372: 'Describe formatted' returns types in upper case
A recent change caused 'describe formatted' to display the types
in all upper case, but we want 'describe formatted' to match Hive's
'describe' output, which displays the types in lower case.

This patch also fixes several problems with test_describe_formatted,
which was encountering an error but reporting success.

Change-Id: I274b97d4d1247244247fb38a5ca7f4c10bba8d22
Reviewed-on: http://gerrit.cloudera.org:8080/4861
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
2016-11-15 05:38:12 +00:00
Lars Volker
0e886618e2 IMPALA-3776: fix 'describe formatted' for Avro tables
For Avro tables the column information in the underlying database of the
Hive metastore can be different from what is specified in the Avro
schema. HIVE-6308 aimed to improve upon this, but for older tables the
two don't necessarily align.

There are two possible cases:

1) Hive's underlying database contains a column which is not present in
the Avro schema file. In this case we encounter a NullPointerException
in DescribeResultFactory.java#L189 when trying to look up the column in
the internal table object.

2) The Avro schema contains a column which is not present in the
underlying database. In this case the column will not be displayed in
describe formatted.
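
As a rough illustration of the two cases, the sketch below compares the
column names from both sources and reports which names are missing on
each side. It only detects the mismatches; it does not reproduce how the
patch handles them. The function name and the sample column names are
hypothetical.

```python
def classify_column_mismatches(metastore_cols, avro_cols):
    """Return (only_in_metastore, only_in_avro), preserving input order.

    only_in_metastore corresponds to case 1 (column missing from the
    Avro schema); only_in_avro corresponds to case 2 (column missing
    from the metastore's underlying database).
    """
    avro_set = set(avro_cols)
    metastore_set = set(metastore_cols)
    only_in_metastore = [c for c in metastore_cols if c not in avro_set]
    only_in_avro = [c for c in avro_cols if c not in metastore_set]
    return only_in_metastore, only_in_avro


# Example with made-up column names:
case1, case2 = classify_column_mismatches(
    ["id", "name", "old_col"], ["id", "name", "new_col"])
```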

In addition to the automated tests, I verified this manually by creating
an Avro table with an external schema file in Hive. This populated the
underlying database with the column information. I then either removed
a column from the Avro schema file (case 1) or cleared the column
information from the "COLUMNS_V2" table in the underlying database
(case 2) and verified that the change fixed both cases.

Change-Id: Ieb69d3678e662465d40aee80ba23132ea13871a0
Reviewed-on: http://gerrit.cloudera.org:8080/4126
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Internal Jenkins
Reviewed-by: Jim Apple <jbapple@cloudera.com>
2016-08-26 17:20:10 +00:00
Huaisi Xu
c6ce32b3b6 IMPALA-3687: Prefer Avro field name during schema reconciliation
Since it is possible to create an Avro table with both column
definitions and an Avro schema, Impala attempts to reconcile
inconsistencies in the two schema definitions, generally preferring the
Avro schema. The only exception to this rule was with
CHAR/VARCHAR/STRING columns, where the column definition was preferred
in order to support tables with CHAR/VARCHAR columns although Avro only
supports STRING. This exception is confusing because the name for such a
column will be taken from the column definition (and not from the Avro
schema).

This patch prefers the name and comment from the Avro schema definition
and uses the column type from the column definition for
CHAR/VARCHAR/STRING columns.
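
The reconciliation rule described above might look roughly like the
following sketch. It is an illustration, not the Impala implementation:
the Column type and the reconcile() helper are hypothetical, and the
exact condition used here (column definition is CHAR/VARCHAR/STRING and
the Avro type is STRING) is an assumption about how the rule is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    type: str
    comment: str = ""


STRING_LIKE = {"CHAR", "VARCHAR", "STRING"}


def _base_type(type_str):
    # "VARCHAR(10)" -> "VARCHAR"
    return type_str.split("(", 1)[0].strip().upper()


def reconcile(col_def, avro_col):
    """Merge one column definition with its Avro schema counterpart."""
    string_like_def = _base_type(col_def.type) in STRING_LIKE
    avro_is_string = _base_type(avro_col.type) == "STRING"
    if string_like_def and avro_is_string:
        # Keep CHAR/VARCHAR from the column definition, since Avro
        # itself only has STRING.
        col_type = col_def.type
    else:
        col_type = avro_col.type
    # Name and comment always come from the Avro schema.
    return Column(avro_col.name, col_type, avro_col.comment)
```

For example, reconciling a column definition `c1 VARCHAR(10)` with an
Avro field `name STRING` yields a column called `name` of type
`VARCHAR(10)`, carrying the Avro comment.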

Change-Id: Ia3e43b2885853c2b4f207a45a873c9d7f31379cd
Reviewed-on: http://gerrit.cloudera.org:8080/3331
Reviewed-by: Huaisi Xu <hxu@cloudera.com>
Tested-by: Internal Jenkins
2016-07-14 19:04:43 +00:00
Huaisi Xu
816735a032 IMPALA-3092: Set default value to NULL in AvroSchemaConverter
This change ensures that Avro tables created without column definitions
remain queryable if columns are added via ALTER TABLE. The bug was that
we did not add default values when synthesizing an Avro schema from the
column definitions.
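
A minimal sketch of what a synthesized schema with NULL defaults could
look like is shown below. It does not reproduce AvroSchemaConverter: the
function name and the small type mapping are hypothetical and cover only
a few primitive types. It relies on the Avro rule that a field default
of null requires "null" to be the first branch of the field's union
type.

```python
import json

# Illustrative subset of a SQL-to-Avro primitive type mapping.
AVRO_PRIMITIVES = {
    "BOOLEAN": "boolean",
    "INT": "int",
    "BIGINT": "long",
    "FLOAT": "float",
    "DOUBLE": "double",
    "STRING": "string",
}


def synthesize_avro_schema(table_name, columns):
    """Build an Avro record schema from (name, sql_type) pairs.

    Every field is a nullable union with a null default, so data files
    written before a column was added can still be read: the missing
    field resolves to null instead of failing schema resolution.
    """
    fields = []
    for name, sql_type in columns:
        fields.append({
            "name": name,
            "type": ["null", AVRO_PRIMITIVES[sql_type.upper()]],
            "default": None,  # serialized as JSON null
        })
    return {"type": "record", "name": table_name, "fields": fields}


schema = synthesize_avro_schema("t", [("id", "INT"), ("s", "STRING")])
schema_json = json.dumps(schema)
```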

Change-Id: Ib86e9ba1f4329b285ae14ee299365f7291a7410e
Reviewed-on: http://gerrit.cloudera.org:8080/3219
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2016-05-31 23:32:11 -07:00