impala

mirror of https://github.com/apache/impala.git synced 2026-01-06 06:01:03 -05:00

Author	SHA1	Message	Date
Anuj Phadke	a915293109	IMPALA-1850: Allow fs.defaultFS to be set to a non-HDFS filesystem This change whitelists the supported filesystems which can be set as Default FS for Impala to run on. This patch configures Impala to use S3 as the default filesystem, rather than a secondary filesystem as before. Change-Id: I2f45bef6c94ece634045acb906d12591587ccfed Reviewed-on: http://gerrit.cloudera.org:8080/1121 Reviewed-by: anujphadke <aphadke@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:17:40 -07:00
Juan Yu	4f61edee1d	IMPALA-2798: Bring in AVRO-1617 fix and add test case for it Impala could crash or return wrong results if it uses the codegen'd Avro decoding function to scan an Avro file that has a different schema than the table schema. With the AVRO-1617 fix, we make sure Impala doesn't use codegen if the table schema has fewer columns than the file schema. Change-Id: I268419e421404ad6b084482dee417634f17ecf60 Reviewed-on: http://gerrit.cloudera.org:8080/1696 Reviewed-by: Juan Yu <jyu@cloudera.com> Tested-by: Internal Jenkins	2016-01-14 06:04:48 +00:00
ishaan	dee6911b20	Enable loading metadata from the hive metastore snapshot and clean up build scripts. This patch contains the following changes: - Add a metastore_snapshot_file parameter to build.sh - Enable skipping loading the metadata. - create-load-data.sh is refactored into functions. - A lot of scripts source impala-config, which creates a lot of log spew. This has now been muted. - Unnecessary log spew from compute-table-stats has been muted. - build_thirdparty.sh determines its parallelism from the system; it was previously hard coded to 4. - Only force load data of the particular dataset if a schema change is detected. Change-Id: I909336451e5c1ca57d21f040eb94c0e831546837 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5540 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-12-19 13:41:00 -08:00
Skye Wanderman-Milne	8e44347831	IMPALA-1149: read bytes fields as strings in HdfsAvroScanner::MaterializeTuple() Hive converts "bytes"-type fields to an array<tinyint> column, which we can't even load the metadata for. However, if a bytes field appears in a file schema but not the table schema, this change allows us to read (but not materialize) the field. Otherwise we can't read the file at all. This change also adds a "bytes"-type field to one of the files in functional_avro_snap.schema_resolution_test. Change-Id: I25953ee049e174fc4dbff5d68520a6f87e545339 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3823 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 0e2e7c1ac0f63623b7ec3724920e9927cd782508) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3895	2014-08-18 20:17:05 -07:00
Alex Behm	ce40134ad0	IMPALA-867: Fail COMPUTE STATS in analysis for Avro tables affected by HIVE-6308. Avro tables that were not created with a column-definition list do not have their columns properly populated in the Metastore backend DB (HIVE-6308). For such tables COMPUTE STATS and Hive's ANALYZE TABLE cannot succeed. This patch fails COMPUTE STATS in analysis for such broken Avro tables and adds tests for Avro tables with a mismatched column-definition list and Avro schema. Change-Id: I561ecea944ae2f83d69950b7a1ab9edaa89bdcea Reviewed-on: http://gerrit.ent.cloudera.com:8080/1892 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1920	2014-03-14 23:24:55 -07:00
Lenni Kuff	fc7733d530	Fix resolution of mismatched column names that come from the deserializer (e.g. Avro tables) Fixes a bug (regression) where the catalog server was not properly resolving column names when a table's column definition did not match its Avro schema definition. The expected behavior in this case is that the Avro schema definition should be used instead of the table columns. We had no test tables that were mismatched so this wasn't caught. This loading of the schema and columns happens when a table's metadata is loaded, so the fix is to just add a toThrift() to Column and not reference metastore.getSd().getCols() directly since it might be the "wrong" set of columns. Change-Id: I341a3a8834f5748f90c246d2093ddb983ecfdd4f Reviewed-on: http://gerrit.ent.cloudera.com:8080/770 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:53:44 -08:00
Lenni Kuff	be1d42c05a	IMPALA-538: Look for Avro schema in SERDEPROPERTIES as well as TBLPROPERTIES Change-Id: If5c0b36d5a3963176b07a0cb1ea680e3e36b2f96 Reviewed-on: http://gerrit.ent.cloudera.com:8080/248 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:52:15 -08:00
Skye Wanderman-Milne	3fecdeb793	IMPALA-441: support default values for Avro tables	2014-01-08 10:51:39 -08:00

8 Commits