The issue is that Impala crashes in ClearResultCache() when result caching is
enabled for parallel inserts. The reason is that ClearResultCache() accesses the
coordinator RuntimeState to update the query mem tracker. However, there is
no coordinator fragment (or RuntimeState) for parallel inserts.
The fix is to initialize a query mem tracker to track memory usage in the coordinator
instance even if there is no coordinator fragment.
Change-Id: I3a2ef14860f683910c29ae19b931202ca6867b9f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2501
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
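
The fix above can be illustrated with a minimal Python sketch. It is not the Impala C++ code: the class and method names (MemTracker, Coordinator, cache_results, clear_result_cache) are hypothetical stand-ins. It only shows the idea that the query mem tracker is created unconditionally, so clearing the result cache never depends on a coordinator RuntimeState existing.

```python
class MemTracker:
    """Hypothetical stand-in for a query-level memory tracker."""

    def __init__(self, label):
        self.label = label
        self.consumption = 0

    def consume(self, num_bytes):
        self.consumption += num_bytes

    def release(self, num_bytes):
        self.consumption -= num_bytes


class Coordinator:
    """Illustrative coordinator; names are assumptions, not Impala's API."""

    def __init__(self, has_coordinator_fragment):
        # Parallel inserts have no coordinator fragment, hence no RuntimeState.
        self.runtime_state = object() if has_coordinator_fragment else None
        # The tracker is created regardless of whether a fragment exists.
        self.query_mem_tracker = MemTracker("query")
        self._cached_bytes = 0

    def cache_results(self, num_bytes):
        self.query_mem_tracker.consume(num_bytes)
        self._cached_bytes += num_bytes

    def clear_result_cache(self):
        # Uses the tracker directly instead of reaching through runtime_state,
        # which may be None.
        self.query_mem_tracker.release(self._cached_bytes)
        self._cached_bytes = 0
```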
Adds a bounded query-result cache that clients can enable by setting
an 'impala.resultset.cache.size' option in the HS2 confOverlay map of the HS2 exec request.
Impala permits FETCH_FIRST for a particular stmt iff result caching is enabled.
FETCH_FIRST will succeed as long as all previously fetched rows fit into the bounded
result cache. Regardless of whether a FETCH_FIRST succeeds or not, clients may
always resume fetching with FETCH_NEXT.
The FETCH_FIRST feature is intended to allow HUE users to export an entire
result set (to Excel, CSV, etc.) after browsing through a few pages of results,
without having to re-run the query from scratch.
Change-Id: I71ab4794ddef30842594c5e1f7bc94724d6ce89f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1356
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1406
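
The FETCH_FIRST / FETCH_NEXT behaviour described above can be sketched roughly as follows. This is an illustration in Python, not Impala's implementation: the class name, the method names, and the choice to count the cache bound in rows are all assumptions made for the example.

```python
class ResultCacheSketch:
    """Bounded cache of already-fetched rows (illustrative only)."""

    def __init__(self, rows, max_cached_rows):
        self._source = iter(rows)    # stands in for the running query
        self._max = max_cached_rows  # assumed bound, counted in rows
        self._cache = []             # set to None once the bound is exceeded
        self._pos = 0                # read position within the cache

    def fetch_next(self, n):
        out = []
        # After a successful fetch_first, replay cached rows first.
        if self._cache is not None:
            out = self._cache[self._pos:self._pos + n]
            self._pos += len(out)
        # Then pull new rows from the query.
        while len(out) < n:
            try:
                row = next(self._source)
            except StopIteration:
                break
            out.append(row)
            if self._cache is not None:
                if len(self._cache) < self._max:
                    self._cache.append(row)
                    self._pos += 1
                else:
                    # Fetched rows no longer fit: drop the cache for good.
                    self._cache = None
        return out

    def fetch_first(self, n):
        if self._cache is None:
            raise RuntimeError("result cache exceeded; cannot restart fetch")
        self._pos = 0
        return self.fetch_next(n)
```

In this sketch a failed fetch_first leaves the read position untouched, so a later fetch_next simply continues where the client left off.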
This change adds support for user impersonation for HS2 authorization
requests. It adds a new flag (--authorized_proxy_user_config) that, if
set, allows users (e.g. hue) to impersonate another user. The user they
wish to impersonate is passed using the HS2 configuration property,
'impala.doas.user'.
The configuration also specifies the list of users each proxy user
can impersonate, or '*' to allow the proxy user to impersonate
any user. For example: hue=user1,user2,admin=*
Change-Id: I2a13e31e5bde2e6df47134458c803168415d0437
Reviewed-on: http://gerrit.ent.cloudera.com:8080/574
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
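
The impersonation check described above might look like the following Python sketch. It assumes the flag value has already been parsed into a mapping from proxy user to the set of users it may impersonate; the function name and the mapping layout are hypothetical and are not taken from the Impala source.

```python
def is_authorized_proxy(proxy_config, connected_user, doas_user):
    """Return True if connected_user may impersonate doas_user.

    proxy_config is an assumed, already-parsed form of the flag:
    {proxy_user: set of users it may impersonate, or {"*"} for any user}.
    """
    allowed = proxy_config.get(connected_user)
    if allowed is None:
        # The connected user is not configured as a proxy user at all.
        return False
    return "*" in allowed or doas_user in allowed


# Parsed equivalent of the example in the commit message.
example_config = {"hue": {"user1", "user2"}, "admin": {"*"}}
```

Here the value of 'impala.doas.user' would be passed as doas_user, and the authenticated connection user as connected_user.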
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles executing metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface that allows impalads to
connect directly and execute their DDL operations.
The CatalogService has two main components - a C++ server that implements StateStore
integration, Thrift service implementation, and exporting of the debug webpage/metrics.
The other main component is the Java Catalog that manages caching and updating of all
the metadata. For each StateStore heartbeat, a delta of all metadata updates is broadcast
to the rest of the cluster.
Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this all catalog objects (Tables/Views,
Databases, UDFs) have a thrift struct to represent them. These are sent with each statestore
delta update.
* The existing Catalog class has been separated into two separate sub-classes: an
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.
What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service will return the catalog version that
contains the change. An impalad will wait for the statestore heartbeat that contains this
version before returning from the DDL command.
* All table types (Hbase, Hdfs, Views) getting their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing
Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
same JAR.
Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
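
The "wait for the catalog version" behaviour listed under "What is working" can be sketched as below. This is a Python illustration under stated assumptions, not the impalad code: the class and method names are hypothetical, and the statestore heartbeat is modelled as a plain callback that delivers the latest catalog version.

```python
import threading


class CatalogVersionWaiter:
    """Blocks a DDL caller until a given catalog version has been seen."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0  # latest catalog version received so far

    def on_statestore_update(self, version):
        # Called for each (hypothetical) heartbeat carrying a catalog delta.
        with self._cond:
            if version > self._version:
                self._version = version
                self._cond.notify_all()

    def wait_for_version(self, min_version, timeout=None):
        # Returns True once the local catalog has reached min_version,
        # or False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._version >= min_version, timeout)
```

A DDL handler would take the version returned by the Catalog Service and call wait_for_version with it before replying to the client.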
This patch adds a SessionHandlerIf to the HiveServer2 Thrift
server. When a socket connection is closed, Impala will terminate
the associated HS2 sessions.
Since HS2 allows for multiple sessions to be multiplexed onto the same
socket connection, a list of session IDs associated with each connection
ID is added to ImpalaServer so that all sessions can be correctly
terminated. This patch also fixes the HS2 implementation so that all
sessions on the same socket get different session IDs.
This patch also adds two metrics to track the number of sessions
currently active for both HS2 and Beeswax APIs.
Change-Id: Ic6276c86b0ab842ac9f434afccd14ca49937bee8
Reviewed-on: http://gerrit.ent.cloudera.com:8080/343
Tested-by: jenkins
Reviewed-by: Alan Choi <alan@cloudera.com>
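
The per-connection session bookkeeping described above could be sketched like this. It is a Python illustration only: SessionRegistry and its methods are hypothetical names, and the real ImpalaServer data structures may differ.

```python
import uuid
from collections import defaultdict


class SessionRegistry:
    """Tracks which sessions were opened on which socket connection."""

    def __init__(self):
        self._sessions_by_conn = defaultdict(set)  # connection ID -> session IDs
        self._conn_by_session = {}                 # session ID -> connection ID

    def open_session(self, connection_id):
        # Every session gets its own ID, even when sharing a connection.
        session_id = uuid.uuid4().hex
        self._sessions_by_conn[connection_id].add(session_id)
        self._conn_by_session[session_id] = connection_id
        return session_id

    def close_session(self, session_id):
        connection_id = self._conn_by_session.pop(session_id, None)
        if connection_id is not None:
            self._sessions_by_conn[connection_id].discard(session_id)

    def on_connection_closed(self, connection_id):
        # Terminate every session multiplexed onto the closed connection.
        closed = self._sessions_by_conn.pop(connection_id, set())
        for session_id in closed:
            self._conn_by_session.pop(session_id, None)
        return closed

    def num_active_sessions(self):
        # Value that an "active sessions" metric would report.
        return len(self._conn_by_session)
```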
The HS2 server does not close sessions when the associated socket
connection fails (unlike Beeswax); if a session is not deliberately
closed it persists forever.
Change-Id: If2d0bfade26a27225023f4f80482bf34132c55c2
Reviewed-on: http://gerrit.ent.cloudera.com:8080/341
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
These HiveServer2 client tests (in tests/hs2) are intended to check the
HS2 API implementation in the following ways:
* API tests: Can all the supported API endpoints be called successfully?
* Query lifecycle tests: Does calling the API in an unusual sequence
result in reasonable behaviour?
* Malformed query tests: Does sending an incorrectly constructed request
correctly result in an error?
This patch adds a few simple tests as a starting point for a larger test suite.
Change-Id: I4b926d1639c640317ea3478bdeb0aa4b5a9286ee
Reviewed-on: http://gerrit.ent.cloudera.com:8080/320
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins