impala/testdata/workloads/functional-query/queries/QueryTest/hbase-inserts.test
Lenni Kuff a2cbd2820e Add Catalog Service and support for automatic metadata refresh
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles executing metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface that allows impalads to
connect directly and execute their DDL operations.
The CatalogService has two main components: a C++ server that implements the StateStore
integration, the Thrift service implementation, and the debug webpage/metrics; and a
Java Catalog that manages caching and updating of all the metadata. For each StateStore
heartbeat, a delta of all metadata updates is broadcast to the rest of the cluster.
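
To make the versioning and delta scheme above concrete, here is a minimal sketch of the
server side. The names (CatalogObjectSketch, addOrUpdate, getDeltaSince) and fields are
made up for illustration and are not the actual thrift definitions or
CatalogServiceCatalog code:

  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  // Placeholder for the thrift struct that describes one catalog object.
  class CatalogObjectSketch {
    String name;   // e.g. "functional.alltypesagg"
    long version;  // catalog version in which this object last changed
  }

  class CatalogServiceCatalogSketch {
    private long catalogVersion = 0;  // monotonically increasing
    private final Map<String, CatalogObjectSketch> objects = new HashMap<>();

    // Record a new or modified object (table/view, database, UDF) and stamp it
    // with the next catalog version.
    synchronized long addOrUpdate(CatalogObjectSketch obj) {
      obj.version = ++catalogVersion;
      objects.put(obj.name, obj);
      return obj.version;
    }

    // Collect every object that changed after the version sent with the previous
    // statestore heartbeat; this delta is what gets broadcast to the cluster.
    synchronized List<CatalogObjectSketch> getDeltaSince(long lastSentVersion) {
      List<CatalogObjectSketch> delta = new ArrayList<>();
      for (CatalogObjectSketch obj : objects.values()) {
        if (obj.version > lastSentVersion) delta.add(obj);
      }
      return delta;
    }
  }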

Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this, all catalog objects (Tables/Views,
Databases, UDFs) have a thrift struct to represent them. These are sent with each statestore
delta update (see the sketch after this list).
* The existing Catalog class has been separated into two subclasses, an
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.
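
The impalad side of the split can be pictured the same way. This is a minimal sketch, not
the real ImpaladCatalog, and it reuses the CatalogObjectSketch placeholder from the sketch
above:

  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  // Hypothetical sketch of the impalad-side cache that each statestore delta
  // update is applied to.
  class ImpaladCatalogSketch {
    private final Map<String, CatalogObjectSketch> objects = new HashMap<>();

    // Replace added/updated objects, drop deleted ones, leave the rest untouched.
    synchronized void applyDelta(List<CatalogObjectSketch> updated,
                                 List<CatalogObjectSketch> deleted) {
      for (CatalogObjectSketch obj : updated) objects.put(obj.name, obj);
      for (CatalogObjectSketch obj : deleted) objects.remove(obj.name);
    }
  }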

What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service will return the catalog version that
  contains the change. An impalad will wait for the statestore heartbeat that contains this
  version before returning from the DDL command (sketched after this list).
* All table types (HBase, HDFS, Views) get their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing
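
A minimal sketch of the catalog-version wait mentioned above (the names are made up for
illustration and do not match the actual impalad code):

  // Tracks the highest catalog version the local catalog has caught up to.
  class CatalogVersionWaiter {
    private long localCatalogVersion = 0;

    // Called after a statestore delta has been applied to the local catalog.
    synchronized void onDeltaApplied(long newVersion) {
      if (newVersion > localCatalogVersion) localCatalogVersion = newVersion;
      notifyAll();
    }

    // A DDL returns the catalog version that contains its change; block until a
    // heartbeat has brought the local catalog up to at least that version.
    synchronized void waitForCatalogVersion(long version) throws InterruptedException {
      while (localCatalogVersion < version) wait();
    }
  }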

Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
  same JAR.

Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:11 -08:00

====
---- QUERY
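# copy every row of functional.alltypesagg into the HBase table insertalltypesagg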
insert into table insertalltypesagg
select id, bigint_col, bool_col, date_string_col, day, double_col, float_col,
int_col, month, smallint_col, string_col, timestamp_col, tinyint_col, year from functional.alltypesagg
---- RESULTS
: 10000
====
---- QUERY
select id, bool_col from insertalltypesagg
WHERE id > 300
ORDER BY id
LIMIT 2
---- RESULTS
301,false
302,true
---- TYPES
INT, BOOLEAN
====
---- QUERY
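# insert rows that all share id 9999999; id maps to the HBase row key, so they should
# collapse into a single stored row (checked by the next query)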
insert into table insertalltypesagg
select 9999999, bigint_col, false, date_string_col, day, double_col, float_col,
int_col, month, smallint_col, string_col, timestamp_col, tinyint_col, year from functional.alltypesagg
---- RESULTS
: 10000
====
---- QUERY
select id, bool_col from insertalltypesagg
WHERE id = 9999999
ORDER BY id
LIMIT 2
---- RESULTS
9999999,false
---- TYPES
INT, BOOLEAN
====
---- QUERY
# test insert into ... select *
# using limit 1 to reduce execution time
insert into table insertalltypesagg
select * from insertalltypesagg limit 1
---- RESULTS
: 1
====
---- QUERY
# test inserting Hive's default text representation of NULL '\N'
# and make sure a scan returns the string and not NULL
insert into table insertalltypesagg
select 9999999, bigint_col, false, "\\N", day, double_col, float_col,
int_col, month, smallint_col, "\\N", timestamp_col, tinyint_col, year from functional.alltypesagg limit 1
---- RESULTS
: 1
====
---- QUERY
select id, date_string_col, string_col from insertalltypesagg
where id = 9999999
---- RESULTS
9999999,'\N','\N'
---- TYPES
INT, STRING, STRING
====
---- QUERY
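# repeat the inserts against insertalltypesaggbinary, the binary-encoded variant of the table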
insert into table insertalltypesaggbinary
select id, bigint_col, bool_col, date_string_col, day, double_col, float_col,
int_col, month, smallint_col, string_col, timestamp_col, tinyint_col, year from functional.alltypesagg
---- RESULTS
: 10000
====
---- QUERY
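# cross-check every inserted row against the source table using a NULL-safe
# comparison on each column; all 10000 rows should match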
select count(*) from (
select hb.* from insertalltypesaggbinary hb, functional.alltypesagg a
where hb.id = a.id
and (hb.bigint_col = a.bigint_col or
(hb.bigint_col is null and a.bigint_col is null))
and (hb.bool_col = a.bool_col or
(hb.bool_col is null and a.bool_col is null))
and (hb.date_string_col = a.date_string_col or
(hb.date_string_col is null and a.date_string_col is null))
and (hb.double_col = a.double_col or
(hb.double_col is null and a.double_col is null))
and (hb.float_col = a.float_col or
(hb.float_col is null and a.float_col is null))
and (hb.int_col = a.int_col or
(hb.int_col is null and a.int_col is null))
and (hb.smallint_col = a.smallint_col or
(hb.smallint_col is null and a.smallint_col is null))
and (hb.tinyint_col = a.tinyint_col or
(hb.tinyint_col is null and a.tinyint_col is null))
and (hb.string_col = a.string_col or
(hb.string_col is null and a.string_col is null))
and (hb.timestamp_col = a.timestamp_col or
(hb.timestamp_col is null and a.timestamp_col is null))
) x
---- RESULTS
10000
---- TYPES
BIGINT
====
---- QUERY
select id, bool_col from insertalltypesaggbinary
WHERE id > 300
ORDER BY id
LIMIT 2
---- RESULTS
301,false
302,true
---- TYPES
INT, BOOLEAN
====
---- QUERY
insert into table insertalltypesaggbinary
select 9999999, bigint_col, false, date_string_col, day, double_col, float_col,
int_col, month, smallint_col, string_col, timestamp_col, tinyint_col, year from functional.alltypesagg
---- RESULTS
: 10000
====
---- QUERY
select id, bool_col from insertalltypesaggbinary
WHERE id = 9999999
ORDER BY id
LIMIT 2
---- RESULTS
9999999,false
---- TYPES
INT, BOOLEAN
====
---- QUERY
# test insert into ... select *
# using limit 1 to reduce execution time
insert into table insertalltypesaggbinary
select * from insertalltypesaggbinary limit 1
---- RESULTS
: 1
====
---- QUERY
# test inserting Hive's default text representation of NULL '\N'
# and make sure a scan returns the string and not NULL
insert into table insertalltypesaggbinary
select 9999999, bigint_col, false, "\\N", day, double_col, float_col,
int_col, month, smallint_col, "\\N", timestamp_col, tinyint_col, year from functional.alltypesagg limit 1
---- RESULTS
: 1
====
---- QUERY
select id, date_string_col, string_col from insertalltypesaggbinary
where id = 9999999
---- RESULTS
9999999,'\N','\N'
---- TYPES
INT, STRING, STRING
====