This change adds support for authorizing based on policy metadata read from the Sentry
Service. Authorization is role based and roles are granted to user groups. Each role
can have zero or more privileges associated with it, granting fine grained access to
specific catalog objects at server, URI, database, or table scope. This patch only
adds support to authorize against metadata read from the Sentry Policy Service, it does
not add support for GRANT/REVOKE statements in Impala.
The authorization metadata is read by the catalog server from the Sentry Service and
propagated to all nodes in the cluster in the "catalog-update" statestore topic. To
enable the Catalog Server to read policy metadata, the --sentry_config must be
set to a valid sentry-site.xml config file.
On the impalad side, we continue to support authorization based on a file-based provider.
To enable file based authorization set the --authorization_policy_file to a
non-empty value. If --authorization_policy_file is not set, authorization will be done
based on cached policy metadata received from the Catalog Server (via the statestore).
TODO: There are still some issues with the Sentry Service that require disabling some of
the authorization tests and adding some workarounds. I have added comments in the code
where these workarounds are needed.
Change-Id: I3765748d2cdbe00f59eefa3c971558efede38eb1
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2552
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
This should allow individual service components, such as a single nodemanager,
to be shutdown for failure testing. The mini-cluster bundled with hadoop is a
single process that does not expose the ability to control individual roles.
Now each role can be controlled and configured independently of the others.
Change-Id: Ic1d42e024226c6867e79916464d184fce886d783
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1432
Tested-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2297
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
Changes include:
- version changes in impala-config
- version changes in various loading scripts
- hbase jars are no longer in hive/lib
- mini-llama script changes
- updates due to sentry api changes
- JDBC tests disabled
- unsupported types tests disabled.
Change-Id: If8cf1b7ad8e22aa4d23094b9a4b1047f7e9d93ee
Fixed codepath with rm disabled. Set enable_rm to false by default.
Change-Id: I3bf2d0525d91243ec3c0ea048b0c03680befcda2
Conflicts:
be/src/runtime/runtime-state.cc
This changes adds support for SQL statement authorization in Impala. The authorization
works by updating the Catalog API to require a User + Privilege when getting Table/Db
objects (and in the future can be extended to cover columns as well).
If the user doesn't have permission to access the object, an AuthorizationException is
thrown. The authorization checks are done during analysis as new Catalog objects are
encountered.
These changes build on top of the Hive Access code which handles the actually
processing of authorization requests. The authorization is currently based
on a "policy file" which will be stored in HDFS. This policy file is read once
on startup and then reloaded every 5 minutes. It can also be reloaded on a
specific impalad by executing a "refresh" command.
Authorization is enabled by setting:
--server_name='server1'
and then pointing the impalad to the policy file using the flag:
--authorization_policy_file=/path/to/policy/file
any authorization configuration problems will result in impalad failing to
start.
This change adds support for auxiliary worksloads, tests, and datasets. This is useful
to augment the regular test runs with some additional tests that do not belong in the
main Impala repo.