Commit Graph

12 Commits

Author SHA1 Message Date
Nong Li
a7beb12540 [CDH5] Fix column stats for decimal.
Change-Id: I72b31f6431bf6259e759fd290200fd1a755f82c6
2014-06-20 23:03:06 -07:00
Nong Li
52f2b2cb52 Fix overflow in decimal divide. Added warning if overflow happened.
Change-Id: I2e9167dbec83b3d1c2cf0e52fae4e09d6b5a38ce
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3141
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3191
2014-06-20 02:24:41 -07:00
Skye Wanderman-Milne
bbb908db1e Add HS2 GetLog() test
Change-Id: I24cc4a1873942cb4d67dcf75ed57ce7becec6f11
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3016
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 33f332f44c31fea747fadc56c7816c1da3b25b6c)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3040
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-06-13 18:39:07 -07:00
Srinath Shankar
d193a1e8a5 IMPALA-963: Impala crash in ClearResultCache()
The issue is that Impala crashes in ClearResultCache() when result caching is on
for parallel inserts. The reason is that ClearResultCache() accesses the
coordinator RuntimeState to update the query mem tracker. However, there is
no coordinator fragment (or RuntimeState) for parallel inserts.
The fix is to initialize a query mem tracker to track memory usage in the coordinator
instance even if there is no coordinator fragment.

Change-Id: I3a2ef14860f683910c29ae19b931202ca6867b9f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2501
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
2014-05-19 12:40:12 -07:00
Nong Li
4b883ac7eb Fix decimal bugs.
Fix overflow handling in a few cases and add decimal as hs2 type.

Change-Id: Ifde1988365f6be961e7eb7404ed37d7bbaab875c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2531
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2564
2014-05-16 00:17:38 -07:00
Alex Behm
6b769d011d Adds limited support for the FETCH_FIRST fetch orientation in HS2 client requests.
Adds a bounded query-result cache that clients can enable by setting
an 'impala.resultset.cache.size' option in the confOverlay map of the HS2 exec request.
Impala permits FETCH_FIRST for a particular stmt iff result caching is enabled.
FETCH_FIRST will succeed as long as all previously fetched rows fit into the bounded
result cache. Regardless of whether a FETCH_FIRST succeeds or not, clients may
always resume fetching with FETCH_NEXT.
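
The FETCH_FIRST rule above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Impala's implementation: the class name `BoundedResultCache` and its methods are hypothetical, the cache bound is counted in rows, and a plain Python iterator stands in for the query's result stream.

```python
from typing import Iterator, List, Optional


class BoundedResultCache:
    """Toy model of a bounded result cache (hypothetical, not Impala code)."""

    def __init__(self, rows: Iterator[tuple], cache_size: int) -> None:
        self._rows = iter(rows)           # underlying result stream
        self._cache_size = cache_size     # assumed bound, counted in rows
        # Rows fetched so far; set to None once the bound has been exceeded.
        self._cache: Optional[List[tuple]] = []
        self._replay_pos = 0              # next cached row to hand out

    def fetch_next(self, n: int) -> List[tuple]:
        out: List[tuple] = []
        # After a rewind, replay cached rows before reading new ones.
        if self._cache is not None:
            while len(out) < n and self._replay_pos < len(self._cache):
                out.append(self._cache[self._replay_pos])
                self._replay_pos += 1
        # Then continue with rows that have never been fetched.
        while len(out) < n:
            try:
                row = next(self._rows)
            except StopIteration:
                break
            out.append(row)
            if self._cache is not None:
                if len(self._cache) < self._cache_size:
                    self._cache.append(row)
                    self._replay_pos = len(self._cache)
                else:
                    # Fetched rows no longer fit: FETCH_FIRST is now impossible.
                    self._cache = None
        return out

    def fetch_first(self, n: int) -> List[tuple]:
        if self._cache is None:
            raise RuntimeError("result cache exceeded; resume with FETCH_NEXT")
        self._replay_pos = 0
        return self.fetch_next(n)
```

In this model a failed `fetch_first` leaves the stream position untouched, so a following `fetch_next` simply continues where the client left off.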

The FETCH_FIRST feature is intended to allow HUE users to export an entire
result set (to Excel, CSV, etc.) after browsing through a few pages of results,
without having to re-run the query from scratch.

Change-Id: I71ab4794ddef30842594c5e1f7bc94724d6ce89f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1356
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1406
2014-01-30 14:58:46 -08:00
Lenni Kuff
af6d381401 IMPALA-565: Support user impersonation for HS2 authorization requests
This change adds support for user impersonation for HS2 authorization
requests. It adds a new flag (--authorized_proxy_user_config) that, if
set, allows a proxy user (e.g. hue) to impersonate another user. The user to
impersonate is passed using the HS2 configuration property
'impala.doas.user'.
The configuration specifies the list of users each proxy user
can impersonate, or '*' to allow the proxy user to impersonate
any user. For example: hue=user1,user2;admin=*
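
As a rough illustration of how such a configuration string could be interpreted, here is a minimal sketch. The function names `parse_proxy_config` and `can_impersonate` are hypothetical and this is not Impala's parser; it assumes that a token containing '=' starts a new proxy-user entry, and it accepts either ',' or ';' between tokens.

```python
import re
from typing import Dict, Optional, Set


def parse_proxy_config(config: str) -> Dict[str, Set[str]]:
    """Parse e.g. 'hue=user1,user2;admin=*' into {proxy_user: {allowed users}}."""
    allowed: Dict[str, Set[str]] = {}
    current: Optional[Set[str]] = None
    for token in re.split(r"[;,]", config):
        token = token.strip()
        if not token:
            continue
        if "=" in token:
            # 'proxy=user' starts the entry for a new proxy user.
            proxy, token = (part.strip() for part in token.split("=", 1))
            current = allowed.setdefault(proxy, set())
            if not token:
                continue
        if current is None:
            raise ValueError(f"user {token!r} listed before any proxy user")
        current.add(token)
    return allowed


def can_impersonate(allowed: Dict[str, Set[str]],
                    proxy_user: str, doas_user: str) -> bool:
    """True if proxy_user may act as doas_user ('*' means any user)."""
    users = allowed.get(proxy_user, set())
    return "*" in users or doas_user in users
```

With the example from the commit message, `hue` may act as `user1` or `user2` only, while `admin` may act as anyone; a user that is not listed as a proxy may impersonate nobody.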

Change-Id: I2a13e31e5bde2e6df47134458c803168415d0437
Reviewed-on: http://gerrit.ent.cloudera.com:8080/574
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:39 -08:00
Henry Robinson
80b4a3b306 IMPALA-619: Add full session metadata to HS2 metadata operations
Change-Id: Ib1ed3710781a4f530b16272af66f5fb48520c628
Reviewed-on: http://gerrit.ent.cloudera.com:8080/679
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:53:27 -08:00
Lenni Kuff
a2cbd2820e Add Catalog Service and support for automatic metadata refresh
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles executing metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface to allow impalads to
connect directly and execute their DDL operations.
The CatalogService has two main components. The first is a C++ server that implements StateStore
integration, the Thrift service implementation, and exporting of the debug webpage/metrics.
The other main component is the Java Catalog that manages caching and updating of all
the metadata. For each StateStore heartbeat, a delta of all metadata updates is broadcast
to the rest of the cluster.
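
The per-heartbeat delta can be pictured with a toy versioned catalog. This is a sketch under assumptions, not the CatalogService code: `ToyCatalog`, `CatalogObject` and `delta_since` are hypothetical names, a string stands in for the Thrift struct, and dropped objects are not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogObject:
    name: str
    version: int   # catalog version at which this object last changed
    payload: str   # stands in for the serialized Thrift struct


@dataclass
class ToyCatalog:
    version: int = 0
    objects: Dict[str, CatalogObject] = field(default_factory=dict)

    def update(self, name: str, payload: str) -> int:
        """Apply a metadata change and return the new catalog version."""
        self.version += 1
        self.objects[name] = CatalogObject(name, self.version, payload)
        return self.version

    def delta_since(self, last_sent_version: int) -> List[CatalogObject]:
        """Objects changed after last_sent_version, oldest change first."""
        changed = [o for o in self.objects.values()
                   if o.version > last_sent_version]
        return sorted(changed, key=lambda o: o.version)
```

On each heartbeat the sender would call `delta_since` with the version it sent last time and then remember the current `version` for the next round.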

Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this, all catalog objects (Tables/Views,
Databases, UDFs) have a thrift struct to represent them. These are sent with each statestore
delta update.
* The existing Catalog class has been separated into two separate sub-classes: an
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.

What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service will return the catalog version that
  contains the change. An impalad will wait for the statestore heartbeat that contains this
  version before returning from the DDL command.
* All table types (Hbase, Hdfs, Views) getting their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing
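
The "wait for the catalog version" behaviour listed above can be sketched with a condition variable. This is an illustrative model only: `CatalogVersionWaiter`, `on_heartbeat` and `wait_for` are hypothetical names, and the real impalad logic is C++ and is not shown in this commit message.

```python
import threading
from typing import Optional


class CatalogVersionWaiter:
    """Blocks a DDL caller until the local catalog has caught up (toy model)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._version = 0  # highest catalog version seen in a heartbeat

    def on_heartbeat(self, catalog_version: int) -> None:
        """Called when a statestore update carrying catalog_version arrives."""
        with self._cond:
            if catalog_version > self._version:
                self._version = catalog_version
                self._cond.notify_all()

    def wait_for(self, min_version: int,
                 timeout: Optional[float] = None) -> bool:
        """Return True once the local version >= min_version, False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self._version >= min_version, timeout)
```

A DDL handler in this model would send the request to the catalog service, read the returned version, and call `wait_for(version)` before replying to the client.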

Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
  same JAR.

Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:11 -08:00
Henry Robinson
640a21802d IMPALA-564: Close HS2 sessions on socket close
This patch adds a SessionHandlerIf to the HiveServer2 Thrift
server. When a socket connection is closed, Impala will terminate
the associated HS2 sessions.

Since HS2 allows for multiple sessions to be multiplexed onto the same
socket connection, a list of session IDs associated with each connection
ID is added to ImpalaServer so that all sessions can be correctly
terminated. This patch also fixes the HS2 implementation so that all
sessions on the same socket get different session IDs.

This patch also adds two metrics to track the number of sessions
currently active for both HS2 and Beeswax APIs.
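
The bookkeeping described above, a list of session IDs per connection ID plus a count of active sessions, could look roughly like this. It is a sketch with hypothetical names (`SessionRegistry` and its methods), not the ImpalaServer code, and it tracks a single API rather than HS2 and Beeswax separately.

```python
import uuid
from collections import defaultdict
from typing import Dict, Set


class SessionRegistry:
    """Maps each connection to the sessions multiplexed onto it (toy model)."""

    def __init__(self) -> None:
        self._by_connection: Dict[str, Set[str]] = defaultdict(set)
        self._active: Set[str] = set()

    def open_session(self, connection_id: str) -> str:
        # Every session gets its own ID, even on a shared connection.
        session_id = uuid.uuid4().hex
        self._by_connection[connection_id].add(session_id)
        self._active.add(session_id)
        return session_id

    def close_session(self, connection_id: str, session_id: str) -> None:
        self._by_connection.get(connection_id, set()).discard(session_id)
        self._active.discard(session_id)

    def on_connection_closed(self, connection_id: str) -> int:
        """Terminate all sessions of a closed connection; return how many."""
        session_ids = self._by_connection.pop(connection_id, set())
        self._active -= session_ids
        return len(session_ids)

    @property
    def num_active_sessions(self) -> int:
        """Value a 'sessions currently active' metric would report."""
        return len(self._active)
```

Keying the map by connection ID is what lets a single socket-close callback clean up every session that was opened over that socket.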

Change-Id: Ic6276c86b0ab842ac9f434afccd14ca49937bee8
Reviewed-on: http://gerrit.ent.cloudera.com:8080/343
Tested-by: jenkins
Reviewed-by: Alan Choi <alan@cloudera.com>
2014-01-08 10:52:33 -08:00
Henry Robinson
c9b92d574f Fix unclosed session in TestHS2
The HS2 server does not close sessions when the associated socket
connection fails (unlike Beeswax); if a session is not deliberately
closed it persists forever.

Change-Id: If2d0bfade26a27225023f4f80482bf34132c55c2
Reviewed-on: http://gerrit.ent.cloudera.com:8080/341
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:52:30 -08:00
Henry Robinson
49eeb8ef2e Add a few client tests for HS2
These HiveServer2 client tests (in tests/hs2) are intended to check the
HS2 API implementation in the following ways:

* API tests: Can all the supported API endpoints be called successfully?
* Query lifecycle tests: does calling the API in an unusual sequence
  result in reasonable behaviour?
* Malformed query tests: does sending an incorrectly constructed request
  correctly result in an error?

This patch adds a few simple tests as a starting point for a larger test suite.

Change-Id: I4b926d1639c640317ea3478bdeb0aa4b5a9286ee
Reviewed-on: http://gerrit.ent.cloudera.com:8080/320
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:52:29 -08:00