mirror of
https://github.com/apache/impala.git
synced 2025-12-30 12:02:10 -05:00
This now gives a clean RAT check with bin/check-rat-report.py, which is one way for the Impala community to check compliance with ASF rules on intellectual property. Change-Id: I2ad06435f84a65ba126759e42a18fdaf52cd7036 Reviewed-on: http://gerrit.cloudera.org:8080/5232 Reviewed-by: Jim Apple <jbapple-impala@apache.org> Tested-by: Impala Public Jenkins Reviewed-by: John Russell <jrussell@cloudera.com>
1640 lines
64 KiB
XML
1640 lines
64 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!--
|
|
Licensed to the Apache Software Foundation (ASF) under one
|
|
or more contributor license agreements. See the NOTICE file
|
|
distributed with this work for additional information
|
|
regarding copyright ownership. The ASF licenses this file
|
|
to you under the Apache License, Version 2.0 (the
|
|
"License"); you may not use this file except in compliance
|
|
with the License. You may obtain a copy of the License at
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
Unless required by applicable law or agreed to in writing,
|
|
software distributed under the License is distributed on an
|
|
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
|
KIND, either express or implied. See the License for the
|
|
specific language governing permissions and limitations
|
|
under the License.
|
|
-->
|
|
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
|
<concept rev="1.1" id="authorization">
|
|
|
|
<title>Enabling Sentry Authorization for Impala</title>
|
|
<prolog>
|
|
<metadata>
|
|
<data name="Category" value="Security"/>
|
|
<data name="Category" value="Sentry"/>
|
|
<data name="Category" value="Impala"/>
|
|
<data name="Category" value="Configuring"/>
|
|
<data name="Category" value="Starting and Stopping"/>
|
|
<data name="Category" value="Users"/>
|
|
<data name="Category" value="Groups"/>
|
|
<data name="Category" value="Administrators"/>
|
|
</metadata>
|
|
</prolog>
|
|
|
|
<conbody id="sentry">
|
|
|
|
<p>
|
|
Authorization determines which users are allowed to access which resources, and what operations they are
|
|
allowed to perform. In Impala 1.1 and higher, you use the Sentry open source project for authorization.
|
|
Sentry adds a fine-grained authorization framework for Hadoop. By default (when authorization is not
|
|
enabled), Impala does all read and write operations with the privileges of the <codeph>impala</codeph> user,
|
|
which is suitable for a development/test environment but not for a secure production environment. When
|
|
authorization is enabled, Impala uses the OS user ID of the user who runs <cmdname>impala-shell</cmdname> or
|
|
other client program, and associates various privileges with each user.
|
|
</p>
|
|
|
|
<note>
|
|
Sentry is typically used in conjunction with Kerberos authentication, which defines which hosts are allowed
|
|
to connect to each server. Using the combination of Sentry and Kerberos prevents malicious users from being
|
|
able to connect by creating a named account on an untrusted machine. See
|
|
<xref href="impala_kerberos.xml#kerberos"/> for details about Kerberos authentication.
|
|
</note>
|
|
|
|
<p audience="PDF" outputclass="toc inpage">
|
|
See the following sections for details about using the Impala authorization features:
|
|
</p>
|
|
</conbody>
|
|
|
|
<concept id="sentry_priv_model">
|
|
|
|
<title>The Sentry Privilege Model</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Privileges can be granted on different objects in the schema. Any privilege that can be granted is
|
|
associated with a level in the object hierarchy. If a privilege is granted on a container object in the
|
|
hierarchy, the child object automatically inherits it. This is the same privilege model as Hive and other
|
|
database systems such as MySQL.
|
|
</p>
|
|
|
|
<p rev="2.3.0 collevelauth">
|
|
The object hierarchy for Impala covers Server, URI, Database, Table, and Column. (The Table privileges apply to views as well;
|
|
anywhere you specify a table name, you can specify a view name instead.)
|
|
Column-level authorization is available in <keyword keyref="impala23_full"/> and higher, as described in
|
|
<xref audience="integrated" href="sg_hive_sql.xml#concept_c2q_4qx_p4/col_level_auth_sentry"/><xref audience="standalone" href="https://www.cloudera.com/documentation/enterprise/latest/topics/sg_hive_sql.html" format="html" scope="external"/>.
|
|
Previously, you constructed views to query specific columns and assigned privileges
|
|
based on the views rather than the base tables.
|
|
</p>
|
|
|
|
<p>
|
|
A restricted set of privileges determines what you can do with each object:
|
|
</p>
|
|
|
|
<dl>
|
|
<dlentry id="select_priv">
|
|
|
|
<dt>
|
|
SELECT privilege
|
|
</dt>
|
|
|
|
<dd>
|
|
Lets you read data from a table or view, for example with the <codeph>SELECT</codeph> statement, the
|
|
<codeph>INSERT...SELECT</codeph> syntax, or <codeph>CREATE TABLE...LIKE</codeph>. Also required to
|
|
issue the <codeph>DESCRIBE</codeph> statement or the <codeph>EXPLAIN</codeph> statement for a query
|
|
against a particular table. Only objects for which a user has this privilege are shown in the output
|
|
for <codeph>SHOW DATABASES</codeph> and <codeph>SHOW TABLES</codeph> statements. The
|
|
<codeph>REFRESH</codeph> statement and <codeph>INVALIDATE METADATA</codeph> statements only access
|
|
metadata for tables for which the user has this privilege.
|
|
</dd>
|
|
|
|
</dlentry>
|
|
|
|
<dlentry id="insert_priv">
|
|
|
|
<dt>
|
|
INSERT privilege
|
|
</dt>
|
|
|
|
<dd>
|
|
Lets you write data to a table. Applies to the <codeph>INSERT</codeph> and <codeph>LOAD DATA</codeph>
|
|
statements.
|
|
</dd>
|
|
|
|
</dlentry>
|
|
|
|
<dlentry id="all_priv">
|
|
|
|
<dt>
|
|
ALL privilege
|
|
</dt>
|
|
|
|
<dd>
|
|
Lets you create or modify the object. Required to run DDL statements such as <codeph>CREATE
|
|
TABLE</codeph>, <codeph>ALTER TABLE</codeph>, or <codeph>DROP TABLE</codeph> for a table,
|
|
<codeph>CREATE DATABASE</codeph> or <codeph>DROP DATABASE</codeph> for a database, or <codeph>CREATE
|
|
VIEW</codeph>, <codeph>ALTER VIEW</codeph>, or <codeph>DROP VIEW</codeph> for a view. Also required for
|
|
the URI of the <q>location</q> parameter for the <codeph>CREATE EXTERNAL TABLE</codeph> and
|
|
<codeph>LOAD DATA</codeph> statements.
|
|
<!-- Have to think about the best wording, how often to repeat, how best to conref this caveat.
|
|
You do not actually code the keyword <codeph>ALL</codeph> in the policy file; instead you use
|
|
<codeph>action=*</codeph> or shorten the right-hand portion of the rule.
|
|
-->
|
|
</dd>
|
|
|
|
</dlentry>
|
|
</dl>
|
|
|
|
<p>
|
|
Privileges can be specified for a table or view before that object actually exists. If you do not have
|
|
sufficient privilege to perform an operation, the error message does not disclose if the object exists or
|
|
not.
|
|
</p>
|
|
|
|
<p>
|
|
Originally, privileges were encoded in a policy file, stored in HDFS. This mode of operation is still an
|
|
option, but the emphasis of privilege management is moving towards being SQL-based. Although currently
|
|
Impala does not have <codeph>GRANT</codeph> or <codeph>REVOKE</codeph> statements, Impala can make use of
|
|
privileges assigned through <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements done through
|
|
Hive. The mode of operation with <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements instead of
|
|
the policy file requires that a special Sentry service be enabled; this service stores, retrieves, and
|
|
manipulates privilege information stored inside the metastore database.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="secure_startup">
|
|
|
|
<title>Starting the impalad Daemon with Sentry Authorization Enabled</title>
|
|
<prolog>
|
|
<metadata>
|
|
<data name="Category" value="Starting and Stopping"/>
|
|
</metadata>
|
|
</prolog>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
To run the <cmdname>impalad</cmdname> daemon with authorization enabled, you add one or more options to the
|
|
<codeph>IMPALA_SERVER_ARGS</codeph> declaration in the <filepath>/etc/default/impala</filepath>
|
|
configuration file:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
The <codeph>-server_name</codeph> option turns on Sentry authorization for Impala. The authorization
|
|
rules refer to a symbolic server name, and you specify the name to use as the argument to the
|
|
<codeph>-server_name</codeph> option.
|
|
</li>
|
|
|
|
<li rev="1.4.0">
|
|
If you specify just <codeph>-server_name</codeph>, Impala uses the Sentry service for authorization,
|
|
relying on the results of <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements issued through
|
|
Hive. (This mode of operation is available in Impala 1.4.0 and higher.) Prior to Impala 1.4.0, or if you
|
|
want to continue storing privilege rules in the policy file, also specify the
|
|
<codeph>-authorization_policy_file</codeph> option as in the following item.
|
|
</li>
|
|
|
|
<li>
|
|
Specifying the <codeph>-authorization_policy_file</codeph> option in addition to
|
|
<codeph>-server_name</codeph> makes Impala read privilege information from a policy file, rather than
|
|
from the metastore database. The argument to the <codeph>-authorization_policy_file</codeph> option
|
|
specifies the HDFS path to the policy file that defines the privileges on different schema objects.
|
|
</li>
|
|
</ul>
|
|
|
|
<p rev="1.4.0">
|
|
For example, you might adapt your <filepath>/etc/default/impala</filepath> configuration to contain lines
|
|
like the following. To use the Sentry service rather than the policy file:
|
|
</p>
|
|
|
|
<codeblock rev="1.4.0">IMPALA_SERVER_ARGS=" \
|
|
-server_name=server1 \
|
|
...
|
|
</codeblock>
|
|
|
|
<p>
|
|
Or to use the policy file, as in releases prior to Impala 1.4:
|
|
</p>
|
|
|
|
<codeblock>IMPALA_SERVER_ARGS=" \
|
|
-authorization_policy_file=/user/hive/warehouse/auth-policy.ini \
|
|
-server_name=server1 \
|
|
...
|
|
</codeblock>
|
|
|
|
<p>
|
|
The preceding examples set up a symbolic name of <codeph>server1</codeph> to refer to the current instance
|
|
of Impala. This symbolic name is used in the following ways:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
<p>
|
|
In an environment managed by Cloudera Manager, the server name is specified through
|
|
<menucascade><uicontrol>Impala (Service-Wide)</uicontrol><uicontrol>Category</uicontrol><uicontrol>Advanced</uicontrol><uicontrol>Sentry Service</uicontrol></menucascade> and
|
|
<menucascade><uicontrol>Hive</uicontrol><uicontrol>Service-Wide</uicontrol><uicontrol>Advanced</uicontrol><uicontrol>Sentry Service</uicontrol></menucascade>.
|
|
The values must be the same for both, so that Impala and Hive can share the privilege rules. Restart
|
|
the Impala and Hive services after setting or changing this value.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
In an environment not managed by Cloudera Manager, you specify this value for the
|
|
<codeph>sentry.hive.server</codeph> property in the <filepath>sentry-site.xml</filepath> configuration
|
|
file for Hive, as well as in the <codeph>-server_name</codeph> option for <cmdname>impalad</cmdname>.
|
|
</p>
|
|
<p>
|
|
If the <cmdname>impalad</cmdname> daemon is not already running, start it as described in
|
|
<xref href="impala_processes.xml#processes"/>. If it is already running, restart it with the command
|
|
<codeph>sudo /etc/init.d/impala-server restart</codeph>. Run the appropriate commands on all the nodes
|
|
where <cmdname>impalad</cmdname> normally runs.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
If you use the mode of operation using the policy file, the rules in the <codeph>[roles]</codeph>
|
|
section of the policy file refer to this same <codeph>server1</codeph> name. For example, the following
|
|
rule sets up a role <codeph>report_generator</codeph> that lets users with that role query any table in
|
|
a database named <codeph>reporting_db</codeph> on a node where the <cmdname>impalad</cmdname> daemon
|
|
was started up with the <codeph>-server_name=server1</codeph> option:
|
|
</p>
|
|
<codeblock>[roles]
|
|
report_generator = server=server1->db=reporting_db->table=*->action=SELECT
|
|
</codeblock>
|
|
</li>
|
|
</ul>
|
|
|
|
<p>
|
|
When <cmdname>impalad</cmdname> is started with one or both of the <codeph>-server_name=server1</codeph>
|
|
and <codeph>-authorization_policy_file</codeph> options, Impala authorization is enabled. If Impala detects
|
|
any errors or inconsistencies in the authorization settings or the policy file, the daemon refuses to
|
|
start.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="sentry_service">
|
|
|
|
<title>Using Impala with the Sentry Service (<keyword keyref="impala14"/> or higher only)</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
When you use the Sentry service rather than the policy file, you set up privileges through
|
|
<codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statement in either Impala or Hive, then both components
|
|
use those same privileges automatically. (Impala added the <codeph>GRANT</codeph> and
|
|
<codeph>REVOKE</codeph> statements in <keyword keyref="impala20_full"/>.)
|
|
</p>
|
|
|
|
<p>
|
|
Hive already had <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements prior to CDH 5.1, but those
|
|
statements were not production-ready. CDH 5.1 is the first release where those statements use the Sentry
|
|
framework and are considered GA level. If you used the Hive <codeph>GRANT</codeph> and
|
|
<codeph>REVOKE</codeph> statements prior to CDH 5.1, you must set up these privileges with the CDH 5.1
|
|
versions of <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> to take advantage of Sentry authorization.
|
|
</p>
|
|
|
|
<p>
|
|
For information about using the updated Hive <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements,
|
|
see
|
|
<!-- Original URL: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Security-Guide/cdh_sg_sentry_service.html -->
|
|
<xref href="http://www.cloudera.com/documentation/enterprise/latest/topics/cm_sg_sentry_service.html" scope="external" format="html">Sentry
|
|
service</xref> topic in the <cite>CDH 5 Security Guide</cite>.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="security_policy_file">
|
|
|
|
<title>Using Impala with the Sentry Policy File</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The policy file is a file that you put in a designated location in HDFS, and is read during the startup of
|
|
the <cmdname>impalad</cmdname> daemon when you specify both the <codeph>-server_name</codeph> and
|
|
<codeph>-authorization_policy_file</codeph> startup options. It controls which objects (databases, tables,
|
|
and HDFS directory paths) can be accessed by the user who connects to <cmdname>impalad</cmdname>, and what
|
|
operations that user can perform on the objects.
|
|
</p>
|
|
|
|
<note rev="1.4.0">
|
|
<p rev="1.4.0">
|
|
In <ph rev="upstream">CDH 5</ph> and higher, <ph rev="upstream">Cloudera</ph> recommends
|
|
managing privileges through SQL statements, as described in
|
|
<xref href="impala_authorization.xml#sentry_service"/>. If you are still using policy files, plan to
|
|
migrate to the new approach some time in the future.
|
|
</p>
|
|
</note>
|
|
|
|
<p>
|
|
The location of the policy file is listed in the <filepath>auth-site.xml</filepath> configuration file. To
|
|
minimize overhead, the security information from this file is cached by each <cmdname>impalad</cmdname>
|
|
daemon and refreshed automatically, with a default interval of 5 minutes. After making a substantial change
|
|
to security policies, restart all Impala daemons to pick up the changes immediately.
|
|
</p>
|
|
|
|
<p outputclass="toc inpage"/>
|
|
</conbody>
|
|
|
|
<concept id="security_policy_file_details">
|
|
|
|
<title>Policy File Location and Format</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The policy file uses the familiar <codeph>.ini</codeph> format, divided into the major sections
|
|
<codeph>[groups]</codeph> and <codeph>[roles]</codeph>. There is also an optional
|
|
<codeph>[databases]</codeph> section, which allows you to specify a specific policy file for a particular
|
|
database, as explained in <xref href="#security_multiple_policy_files"/>. Another optional section,
|
|
<codeph>[users]</codeph>, allows you to override the OS-level mapping of users to groups; that is an
|
|
advanced technique primarily for testing and debugging, and is beyond the scope of this document.
|
|
</p>
|
|
|
|
<p>
|
|
In the <codeph>[groups]</codeph> section, you define various categories of users and select which roles
|
|
are associated with each category. The group and usernames correspond to Linux groups and users on the
|
|
server where the <cmdname>impalad</cmdname> daemon runs.
|
|
</p>
|
|
|
|
<p>
|
|
The group and usernames in the <codeph>[groups]</codeph> section correspond to Linux groups and users on
|
|
the server where the <cmdname>impalad</cmdname> daemon runs. When you access Impala through the
|
|
<cmdname>impalad</cmdname> interpreter, for purposes of authorization, the user is the logged-in Linux
|
|
user and the groups are the Linux groups that user is a member of. When you access Impala through the
|
|
ODBC or JDBC interfaces, the user and password specified through the connection string are used as login
|
|
credentials for the Linux server, and authorization is based on that username and the associated Linux
|
|
group membership.
|
|
</p>
|
|
|
|
<p>
|
|
In the <codeph>[roles]</codeph> section, you a set of roles. For each role, you specify precisely the set
|
|
of privileges is available. That is, which objects users with that role can access, and what operations
|
|
they can perform on those objects. This is the lowest-level category of security information; the other
|
|
sections in the policy file map the privileges to higher-level divisions of groups and users. In the
|
|
<codeph>[groups]</codeph> section, you specify which roles are associated with which groups. The group
|
|
and usernames correspond to Linux groups and users on the server where the <cmdname>impalad</cmdname>
|
|
daemon runs. The privileges are specified using patterns like:
|
|
<codeblock>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=SELECT
|
|
server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=CREATE
|
|
server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=ALL
|
|
</codeblock>
|
|
For the <varname>server_name</varname> value, substitute the same symbolic name you specify with the
|
|
<cmdname>impalad</cmdname> <codeph>-server_name</codeph> option. You can use <codeph>*</codeph> wildcard
|
|
characters at each level of the privilege specification to allow access to all such objects. For example:
|
|
<codeblock>server=impala-host.example.com->db=default->table=t1->action=SELECT
|
|
server=impala-host.example.com->db=*->table=*->action=CREATE
|
|
server=impala-host.example.com->db=*->table=audit_log->action=SELECT
|
|
server=impala-host.example.com->db=default->table=t1->action=*
|
|
</codeblock>
|
|
</p>
|
|
|
|
<p>
|
|
When authorization is enabled, Impala uses the policy file as a <i>whitelist</i>, representing every
|
|
privilege available to any user on any object. That is, only operations specified for the appropriate
|
|
combination of object, role, group, and user are allowed; all other operations are not allowed. If a
|
|
group or role is defined multiple times in the policy file, the last definition takes precedence.
|
|
</p>
|
|
|
|
<p>
|
|
To understand the notion of whitelisting, set up a minimal policy file that does not provide any
|
|
privileges for any object. When you connect to an Impala node where this policy file is in effect, you
|
|
get no results for <codeph>SHOW DATABASES</codeph>, and an error when you issue any <codeph>SHOW
|
|
TABLES</codeph>, <codeph>USE <varname>database_name</varname></codeph>, <codeph>DESCRIBE
|
|
<varname>table_name</varname></codeph>, <codeph>SELECT</codeph>, and or other statements that expect to
|
|
access databases or tables, even if the corresponding databases and tables exist.
|
|
</p>
|
|
|
|
<p>
|
|
The contents of the policy file are cached, to avoid a performance penalty for each query. The policy
|
|
file is re-checked by each <cmdname>impalad</cmdname> node every 5 minutes. When you make a
|
|
non-time-sensitive change such as adding new privileges or new users, you can let the change take effect
|
|
automatically a few minutes later. If you remove or reduce privileges, and want the change to take effect
|
|
immediately, restart the <cmdname>impalad</cmdname> daemon on all nodes, again specifying the
|
|
<codeph>-server_name</codeph> and <codeph>-authorization_policy_file</codeph> options so that the rules
|
|
from the updated policy file are applied.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="security_examples">
|
|
|
|
<title>Examples of Policy File Rules for Security Scenarios</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The following examples show rules that might go in the policy file to deal with various
|
|
authorization-related scenarios. For illustration purposes, this section shows several very small policy
|
|
files with only a few rules each. In your environment, typically you would define many roles to cover all
|
|
the scenarios involving your own databases, tables, and applications, and a smaller number of groups,
|
|
whose members are given the privileges from one or more roles.
|
|
</p>
|
|
|
|
<example id="sec_ex_unprivileged">
|
|
|
|
<title>A User with No Privileges</title>
|
|
|
|
<p>
|
|
If a user has no privileges at all, that user cannot access any schema objects in the system. The error
|
|
messages do not disclose the names or existence of objects that the user is not authorized to read.
|
|
</p>
|
|
|
|
<p>
|
|
<!-- This example demonstrates the lack of privileges using a blank policy file, so no users have any privileges. -->
|
|
This is the experience you want a user to have if they somehow log into a system where they are not an
|
|
authorized Impala user. In a real deployment with a filled-in policy file, a user might have no
|
|
privileges because they are not a member of any of the relevant groups mentioned in the policy file.
|
|
</p>
|
|
|
|
<!-- Have the raw material but not formatted into easily digestible example. Do for first 1.1 doc refresh.
|
|
<codeblock></codeblock> -->
|
|
|
|
</example>
|
|
|
|
<example id="sec_ex_superuser">
|
|
|
|
<title>Examples of Privileges for Administrative Users</title>
|
|
|
|
<p>
|
|
When an administrative user has broad access to tables or databases, the associated rules in the
|
|
<codeph>[roles]</codeph> section typically use wildcards and/or inheritance. For example, in the
|
|
following sample policy file, <codeph>db=*</codeph> refers to all databases and
|
|
<codeph>db=*->table=*</codeph> refers to all tables in all databases.
|
|
</p>
|
|
|
|
<p>
|
|
Omitting the rightmost portion of a rule means that the privileges apply to all the objects that could
|
|
be specified there. For example, in the following sample policy file, the
|
|
<codeph>all_databases</codeph> role has all privileges for all tables in all databases, while the
|
|
<codeph>one_database</codeph> role has all privileges for all tables in one specific database. The
|
|
<codeph>all_databases</codeph> role does not grant privileges on URIs, so a group with that role could
|
|
not issue a <codeph>CREATE TABLE</codeph> statement with a <codeph>LOCATION</codeph> clause. The
|
|
<codeph>entire_server</codeph> role has all privileges on both databases and URIs within the server.
|
|
</p>
|
|
|
|
<codeblock>[groups]
|
|
supergroup = all_databases
|
|
|
|
[roles]
|
|
read_all_tables = server=server1->db=*->table=*->action=SELECT
|
|
all_tables = server=server1->db=*->table=*
|
|
all_databases = server=server1->db=*
|
|
one_database = server=server1->db=test_db
|
|
entire_server = server=server1
|
|
</codeblock>
|
|
|
|
</example>
|
|
|
|
<example id="sec_ex_detailed">
|
|
|
|
<title>A User with Privileges for Specific Databases and Tables</title>
|
|
|
|
<p>
|
|
If a user has privileges for specific tables in specific databases, the user can access those things
|
|
but nothing else. They can see the tables and their parent databases in the output of <codeph>SHOW
|
|
TABLES</codeph> and <codeph>SHOW DATABASES</codeph>, <codeph>USE</codeph> the appropriate databases,
|
|
and perform the relevant actions (<codeph>SELECT</codeph> and/or <codeph>INSERT</codeph>) based on the
|
|
table privileges. To actually create a table requires the <codeph>ALL</codeph> privilege at the
|
|
database level, so you might define separate roles for the user that sets up a schema and other users
|
|
or applications that perform day-to-day operations on the tables.
|
|
</p>
|
|
|
|
<p>
|
|
The following sample policy file shows some of the syntax that is appropriate as the policy file grows,
|
|
such as the <codeph>#</codeph> comment syntax, <codeph>\</codeph> continuation syntax, and comma
|
|
separation for roles assigned to groups or privileges assigned to roles.
|
|
</p>
|
|
|
|
<codeblock>[groups]
|
|
cloudera = training_sysadmin, instructor
|
|
visitor = student
|
|
|
|
[roles]
|
|
training_sysadmin = server=server1->db=training, \
|
|
server=server1->db=instructor_private, \
|
|
server=server1->db=lesson_development
|
|
instructor = server=server1->db=training->table=*->action=*, \
|
|
server=server1->db=instructor_private->table=*->action=*, \
|
|
server=server1->db=lesson_development->table=lesson*
|
|
# This particular course is all about queries, so the students can SELECT but not INSERT or CREATE/DROP.
|
|
student = server=server1->db=training->table=lesson_*->action=SELECT
|
|
</codeblock>
|
|
|
|
</example>
|
|
|
|
<!--
|
|
<example id="sec_ex_superuser_single_db">
|
|
<title>A User with Full Privileges for a Specific Database</title>
|
|
<p>
|
|
|
|
</p>
|
|
<codeblock></codeblock>
|
|
</example>
|
|
|
|
<example id="sec_ex_readonly_single_db">
|
|
<title>A User with Read-Only Privileges for a Specific Database</title>
|
|
<p>
|
|
|
|
</p>
|
|
<codeblock></codeblock>
|
|
<p>
|
|
If a user has <codeph>SELECT</codeph> privilege for a database, they can issue a <codeph>USE</codeph> statement
|
|
for that database. Whether or not they can access tables within the database depends on further privileges
|
|
defined at the table level.
|
|
</p>
|
|
|
|
<codeblock></codeblock>
|
|
|
|
</example>
|
|
|
|
<example id="sec_ex_superuser_single_table">
|
|
<title>A User with Full Privileges for a Specific Table</title>
|
|
<p>
|
|
If a user has <codeph>SELECT</codeph> privilege for a table, they can query, describe, or explain queries for
|
|
that table.
|
|
</p>
|
|
|
|
<codeblock></codeblock>
|
|
</example>
|
|
|
|
<example id="sec_ex_load_data">
|
|
<title>A User with Privileges to Load Data but not Read Data</title>
|
|
|
|
<p>
|
|
If a user has <codeph>INSERT</codeph> privilege for a table, they can write to the table if it already exists.
|
|
They cannot create or alter the table; those operations require the <codeph>ALL</codeph> privilege.
|
|
</p>
|
|
<codeblock></codeblock>
|
|
</example>
|
|
-->
|
|
|
|
<example id="sec_ex_external_files">
|
|
|
|
<title>Privileges for Working with External Data Files</title>
|
|
|
|
<p>
|
|
When data is being inserted through the <codeph>LOAD DATA</codeph> statement, or is referenced from an
|
|
HDFS location outside the normal Impala database directories, the user also needs appropriate
|
|
permissions on the URIs corresponding to those HDFS locations.
|
|
</p>
|
|
|
|
<p>
|
|
In this sample policy file:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
The <codeph>external_table</codeph> role lets us insert into and query the Impala table,
|
|
<codeph>external_table.sample</codeph>.
|
|
</li>
|
|
|
|
<li>
|
|
The <codeph>staging_dir</codeph> role lets us specify the HDFS path
|
|
<filepath>/user/cloudera/external_data</filepath> with the <codeph>LOAD DATA</codeph> statement.
|
|
Remember, when Impala queries or loads data files, it operates on all the files in that directory,
|
|
not just a single file, so any Impala <codeph>LOCATION</codeph> parameters refer to a directory
|
|
rather than an individual file.
|
|
</li>
|
|
|
|
<li>
|
|
We included the IP address and port of the Hadoop name node in the HDFS URI of the
|
|
<codeph>staging_dir</codeph> rule. We found those details in
|
|
<filepath>/etc/hadoop/conf/core-site.xml</filepath>, under the <codeph>fs.default.name</codeph>
|
|
element. That is what we use in any roles that specify URIs (that is, the locations of directories in
|
|
HDFS).
|
|
</li>
|
|
|
|
<li>
|
|
We start this example after the table <codeph>external_table.sample</codeph> is already created. In
|
|
the policy file for the example, we have already taken away the <codeph>external_table_admin</codeph>
|
|
role from the <codeph>cloudera</codeph> group, and replaced it with the lesser-privileged
|
|
<codeph>external_table</codeph> role.
|
|
</li>
|
|
|
|
<li>
|
|
We assign privileges to a subdirectory underneath <filepath>/user/cloudera</filepath> in HDFS,
|
|
because such privileges also apply to any subdirectories underneath. If we had assigned privileges to
|
|
the parent directory <filepath>/user/cloudera</filepath>, it would be too likely to mess up other
|
|
files by specifying a wrong location by mistake.
|
|
</li>
|
|
|
|
<li>
|
|
The <codeph>cloudera</codeph> under the <codeph>[groups]</codeph> section refers to the
|
|
<codeph>cloudera</codeph> group. (In the demo VM used for this example, there is a
|
|
<codeph>cloudera</codeph> user that is a member of a <codeph>cloudera</codeph> group.)
|
|
</li>
|
|
</ul>
|
|
|
|
<p>
|
|
Policy file:
|
|
</p>
|
|
|
|
<codeblock>[groups]
|
|
cloudera = external_table, staging_dir
|
|
|
|
[roles]
|
|
external_table_admin = server=server1->db=external_table
|
|
external_table = server=server1->db=external_table->table=sample->action=*
|
|
staging_dir = server=server1->uri=hdfs://127.0.0.1:8020/user/cloudera/external_data->action=*
|
|
</codeblock>
|
|
|
|
<p>
|
|
<cmdname>impala-shell</cmdname> session:
|
|
</p>
|
|
|
|
<codeblock>[localhost:21000] > use external_table;
|
|
Query: use external_table
|
|
[localhost:21000] > show tables;
|
|
Query: show tables
|
|
Query finished, fetching results ...
|
|
+--------+
|
|
| name |
|
|
+--------+
|
|
| sample |
|
|
+--------+
|
|
Returned 1 row(s) in 0.02s
|
|
|
|
[localhost:21000] > select * from sample;
|
|
Query: select * from sample
|
|
Query finished, fetching results ...
|
|
+-----+
|
|
| x |
|
|
+-----+
|
|
| 1 |
|
|
| 5 |
|
|
| 150 |
|
|
+-----+
|
|
Returned 3 row(s) in 1.04s
|
|
|
|
[localhost:21000] > load data inpath '/user/cloudera/external_data' into table sample;
|
|
Query: load data inpath '/user/cloudera/external_data' into table sample
|
|
Query finished, fetching results ...
|
|
+----------------------------------------------------------+
|
|
| summary |
|
|
+----------------------------------------------------------+
|
|
| Loaded 1 file(s). Total files in destination location: 2 |
|
|
+----------------------------------------------------------+
|
|
Returned 1 row(s) in 0.26s
|
|
[localhost:21000] > select * from sample;
|
|
Query: select * from sample
|
|
Query finished, fetching results ...
|
|
+-------+
|
|
| x |
|
|
+-------+
|
|
| 2 |
|
|
| 4 |
|
|
| 6 |
|
|
| 8 |
|
|
| 64738 |
|
|
| 49152 |
|
|
| 1 |
|
|
| 5 |
|
|
| 150 |
|
|
+-------+
|
|
Returned 9 row(s) in 0.22s
|
|
|
|
[localhost:21000] > load data inpath '/user/cloudera/unauthorized_data' into table sample;
|
|
Query: load data inpath '/user/cloudera/unauthorized_data' into table sample
|
|
ERROR: AuthorizationException: User 'cloudera' does not have privileges to access: hdfs://127.0.0.1:8020/user/cloudera/unauthorized_data
|
|
</codeblock>
|
|
|
|
</example>
|
|
|
|
<example audience="Cloudera" id="sec_ex_views" rev="2.3.0 collevelauth">
|
|
|
|
<title>Controlling Access at the Column Level through Views</title>
|
|
|
|
<p>
|
|
If a user has <codeph>SELECT</codeph> privilege for a view, they can query the view, even if they do
|
|
not have any privileges on the underlying table. To see the details about the underlying table through
|
|
<codeph>EXPLAIN</codeph> or <codeph>DESCRIBE FORMATTED</codeph> statements on the view, the user must
|
|
also have <codeph>SELECT</codeph> privilege for the underlying table.
|
|
</p>
|
|
|
|
<note type="important">
|
|
<p>
|
|
The types of data that are considered sensitive and confidential differ depending on the jurisdiction
|
|
the type of industry, or both. For fine-grained access controls, set up appropriate privileges based
|
|
on all applicable laws and regulations.
|
|
</p>
|
|
<p>
|
|
Be careful using the <codeph>ALTER VIEW</codeph> statement to point an existing view at a different
|
|
base table or a new set of columns that includes sensitive or restricted data. Make sure that any
|
|
users who have <codeph>SELECT</codeph> privilege on the view do not gain access to any additional
|
|
information they are not authorized to see.
|
|
</p>
|
|
</note>
|
|
|
|
<p>
|
|
The following example shows how a system administrator could set up a table containing some columns
|
|
with sensitive information, then create a view that only exposes the non-confidential columns.
|
|
</p>
|
|
|
|
<codeblock>[localhost:21000] > create table sensitive_info
|
|
> (
|
|
> name string,
|
|
> address string,
|
|
> credit_card string,
|
|
> taxpayer_id string
|
|
> );
|
|
[localhost:21000] > create view name_address_view as select name, address from sensitive_info;
|
|
</codeblock>
|
|
|
|
<p>
|
|
Then the following policy file specifies read-only privilege for that view, without authorizing access
|
|
to the underlying table:
|
|
</p>
|
|
|
|
<codeblock>[groups]
|
|
cloudera = view_only_privs
|
|
|
|
[roles]
|
|
view_only_privs = server=server1->db=reports->table=name_address_view->action=SELECT
|
|
</codeblock>
|
|
|
|
<p>
|
|
Thus, a user with the <codeph>view_only_privs</codeph> role could access through Impala queries the
|
|
basic information but not the sensitive information, even if both kinds of information were part of the
|
|
same data file.
|
|
</p>
|
|
|
|
<p>
|
|
You might define other views to allow users from different groups to query different sets of columns.
|
|
</p>
|
|
|
|
</example>
|
|
|
|
<example id="sec_sysadmin">
|
|
|
|
<title>Separating Administrator Responsibility from Read and Write Privileges</title>
|
|
|
|
<p>
|
|
Remember that to create a database requires full privilege on that database, while day-to-day
|
|
operations on tables within that database can be performed with lower levels of privilege on specific
|
|
table. Thus, you might set up separate roles for each database or application: an administrative one
|
|
that could create or drop the database, and a user-level one that can access only the relevant tables.
|
|
</p>
|
|
|
|
<p>
|
|
For example, this policy file divides responsibilities between users in 3 different groups:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
Members of the <codeph>supergroup</codeph> group have the <codeph>training_sysadmin</codeph> role and
|
|
so can set up a database named <codeph>training</codeph>.
|
|
</li>
|
|
|
|
<li>
|
|
Members of the <codeph>cloudera</codeph> group have the <codeph>instructor</codeph> role and so can
|
|
create, insert into, and query any tables in the <codeph>training</codeph> database, but cannot
|
|
create or drop the database itself.
|
|
</li>
|
|
|
|
<li>
|
|
Members of the <codeph>visitor</codeph> group have the <codeph>student</codeph> role and so can query
|
|
those tables in the <codeph>training</codeph> database.
|
|
</li>
|
|
</ul>
|
|
|
|
<codeblock>[groups]
|
|
supergroup = training_sysadmin
|
|
cloudera = instructor
|
|
visitor = student
|
|
|
|
[roles]
|
|
training_sysadmin = server=server1->db=training
|
|
instructor = server=server1->db=training->table=*->action=*
|
|
student = server=server1->db=training->table=*->action=SELECT
|
|
</codeblock>
|
|
|
|
</example>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="security_multiple_policy_files">
|
|
|
|
<title>Using Multiple Policy Files for Different Databases</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
For an Impala cluster with many databases being accessed by many users and applications, it might be
|
|
cumbersome to update the security policy file for each privilege change or each new database, table, or
|
|
view. You can allow security to be managed separately for individual databases, by setting up a separate
|
|
policy file for each database:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
Add the optional <codeph>[databases]</codeph> section to the main policy file.
|
|
</li>
|
|
|
|
<li>
|
|
Add entries in the <codeph>[databases]</codeph> section for each database that has its own policy file.
|
|
</li>
|
|
|
|
<li>
|
|
For each listed database, specify the HDFS path of the appropriate policy file.
|
|
</li>
|
|
</ul>
|
|
|
|
<p>
|
|
For example:
|
|
</p>
|
|
|
|
<codeblock>[databases]
|
|
# Defines the location of the per-DB policy files for the 'customers' and 'sales' databases.
|
|
customers = hdfs://ha-nn-uri/etc/access/customers.ini
|
|
sales = hdfs://ha-nn-uri/etc/access/sales.ini
|
|
</codeblock>
|
|
|
|
<p>
|
|
To enable URIs in per-DB policy files, add the following string in the Cloudera Manager field
|
|
<uicontrol>Impala Service Environment Advanced Configuration Snippet (Safety Valve)</uicontrol>:
|
|
</p>
|
|
|
|
<codeblock>JAVA_TOOL_OPTIONS="-Dsentry.allow.uri.db.policyfile=true"
|
|
</codeblock>
|
|
|
|
<note type="important">
|
|
Enabling URIs in per-DB policy files introduces a security risk by allowing the owner of the db-level
|
|
policy file to grant himself/herself load privileges to anything the <codeph>impala</codeph> user has
|
|
read permissions for in HDFS (including data in other databases controlled by different db-level policy
|
|
files).
|
|
</note>
|
|
</conbody>
|
|
</concept>
|
|
</concept>
|
|
|
|
<concept id="security_schema">
|
|
|
|
<title>Setting Up Schema Objects for a Secure Impala Deployment</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Remember that in your role definitions, you specify privileges at the level of individual databases and
|
|
tables, or all databases or all tables within a database. To simplify the structure of these rules, plan
|
|
ahead of time how to name your schema objects so that data with different authorization requirements is
|
|
divided into separate databases.
|
|
</p>
|
|
|
|
<p>
|
|
If you are adding security on top of an existing Impala deployment, remember that you can rename tables or
|
|
even move them between databases using the <codeph>ALTER TABLE</codeph> statement. In Impala, creating new
|
|
databases is a relatively inexpensive operation, basically just creating a new directory in HDFS.
|
|
</p>
|
|
|
|
<p>
|
|
You can also plan the security scheme and set up the policy file before the actual schema objects named in
|
|
the policy file exist. Because the authorization capability is based on whitelisting, a user can only
|
|
create a new database or table if the required privilege is already in the policy file: either by listing
|
|
the exact name of the object being created, or a <codeph>*</codeph> wildcard to match all the applicable
|
|
objects within the appropriate container.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="security_privileges">
|
|
|
|
<title>Privilege Model and Object Hierarchy</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Privileges can be granted on different objects in the schema. Any privilege that can be granted is
|
|
associated with a level in the object hierarchy. If a privilege is granted on a container object in the
|
|
hierarchy, the child object automatically inherits it. This is the same privilege model as Hive and other
|
|
database systems such as MySQL.
|
|
</p>
|
|
|
|
<p>
|
|
The kinds of objects in the schema hierarchy are:
|
|
</p>
|
|
|
|
<codeblock>Server
|
|
URI
|
|
Database
|
|
Table
|
|
</codeblock>
|
|
|
|
<p>
|
|
The server name is specified by the <codeph>-server_name</codeph> option when <cmdname>impalad</cmdname>
|
|
starts. Specify the same name for all <cmdname>impalad</cmdname> nodes in the cluster.
|
|
</p>
|
|
|
|
<p>
|
|
URIs represent the HDFS paths you specify as part of statements such as <codeph>CREATE EXTERNAL
|
|
TABLE</codeph> and <codeph>LOAD DATA</codeph>. Typically, you specify what look like UNIX paths, but these
|
|
locations can also be prefixed with <codeph>hdfs://</codeph> to make clear that they are really URIs. To
|
|
set privileges for a URI, specify the name of a directory, and the privilege applies to all the files in
|
|
that directory and any directories underneath it.
|
|
</p>
|
|
|
|
<p rev="2.3.0 collevelauth">
|
|
In <keyword keyref="impala23_full"/> and higher, you can specify privileges for individual columns,
|
|
as described in
|
|
<xref audience="integrated" href="sg_hive_sql.xml#concept_c2q_4qx_p4/col_level_auth_sentry"/><xref audience="standalone" href="https://www.cloudera.com/documentation/enterprise/latest/topics/sg_hive_sql.html" format="html" scope="external"/>.
|
|
Formerly, to specify
|
|
read privileges at this level, you created a view that queried specific columns and/or partitions from a base
|
|
table, and gave <codeph>SELECT</codeph> privilege on the view but not the underlying table.
|
|
</p>
|
|
|
|
<p>
|
|
URIs must start with either <codeph>hdfs://</codeph> or <codeph>file://</codeph>. If a URI starts with
|
|
anything else, it will cause an exception and the policy file will be invalid. When defining URIs for HDFS,
|
|
you must also specify the NameNode. For example:
|
|
<codeblock>data_read = server=server1->uri=file:///path/to/dir, \
|
|
server=server1->uri=hdfs://namenode:port/path/to/dir
|
|
</codeblock>
|
|
<note type="warning">
|
|
<p>
|
|
Because the NameNode host and port must be specified, Cloudera strongly recommends you use High
|
|
Availability (HA). This ensures that the URI will remain constant even if the namenode changes.
|
|
</p>
|
|
<codeblock>data_read = server=server1->uri=file:///path/to/dir,\ server=server1->uri=hdfs://ha-nn-uri/path/to/dir
|
|
</codeblock>
|
|
</note>
|
|
</p>
|
|
|
|
<!-- Experiment with replacing my original copied table with a conref'ed version of Ambreen's from the Security Guide. -->
|
|
|
|
<!--
|
|
<table>
|
|
<title>Sentry privilege types and objects they apply to</title>
|
|
<tgroup cols="2">
|
|
<colspec colnum="1" colname="col1"/>
|
|
<colspec colnum="2" colname="col2"/>
|
|
<tbody>
|
|
<row>
|
|
<entry>Privilege</entry>
|
|
<entry>Object</entry>
|
|
</row>
|
|
<row>
|
|
<entry>INSERT</entry>
|
|
<entry>TABLE, URI</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SELECT</entry>
|
|
<entry>TABLE, VIEW, URI</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALL</entry>
|
|
<entry>SERVER, DB, URI</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
-->
|
|
|
|
<table conref="../shared/impala_common.xml#common/sentry_privileges_objects">
|
|
<tgroup cols="2">
|
|
<colspec colnum="1" colname="col1" colwidth="1*"/>
|
|
<tbody>
|
|
<row>
|
|
<entry/>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<note>
|
|
<p>
|
|
Although this document refers to the <codeph>ALL</codeph> privilege, currently if you use the policy file
|
|
mode, you do not use the actual keyword <codeph>ALL</codeph> in the policy file. When you code role
|
|
entries in the policy file:
|
|
</p>
|
|
<ul>
|
|
<li>
|
|
To specify the <codeph>ALL</codeph> privilege for a server, use a role like
|
|
<codeph>server=<varname>server_name</varname></codeph>.
|
|
</li>
|
|
|
|
<li>
|
|
To specify the <codeph>ALL</codeph> privilege for a database, use a role like
|
|
<codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname></codeph>.
|
|
</li>
|
|
|
|
<li>
|
|
To specify the <codeph>ALL</codeph> privilege for a table, use a role like
|
|
<codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=*</codeph>.
|
|
</li>
|
|
</ul>
|
|
</note>
|
|
|
|
<draft-comment author="ambreen.kazi" translate="no">The Privilege Model lives at:
|
|
https://wiki.cloudera.com/pages/viewpage.action?pageId=24919851</draft-comment>
|
|
|
|
<table>
|
|
<tgroup cols="4">
|
|
<colspec colnum="1" colname="col1" colwidth="1.31*"/>
|
|
<colspec colnum="2" colname="col2" colwidth="1.17*"/>
|
|
<colspec colnum="3" colname="col3" colwidth="1*"/>
|
|
<colspec colname="newCol4" colnum="4" colwidth="1*"/>
|
|
<thead>
|
|
<row>
|
|
<entry>
|
|
Operation
|
|
</entry>
|
|
<entry>
|
|
Scope
|
|
</entry>
|
|
<entry>
|
|
Privileges
|
|
</entry>
|
|
<entry>
|
|
URI
|
|
</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row conref="../shared/impala_common.xml#common/explain_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/load_data_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/create_database_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/drop_database_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/create_table_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/drop_table_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/describe_table_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_add_columns_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_replace_columns_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_change_column_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_rename_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_set_tblproperties_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_set_fileformat_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_set_location_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_add_partition_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_add_partition_location_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_drop_partition_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_partition_set_fileformat_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_set_serdeproperties_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/create_view_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/drop_view_privs">
|
|
<entry/>
|
|
</row>
|
|
<row id="alter_view_privs">
|
|
<entry>
|
|
ALTER VIEW
|
|
</entry>
|
|
<entry rev="2.3.0 collevelauth">
|
|
You need <codeph>ALL</codeph> privilege on the named view <ph rev="1.4.0">and the parent
|
|
database</ph>, plus <codeph>SELECT</codeph> privilege for any tables or views referenced by the
|
|
view query. Once the view is created or altered by a high-privileged system administrator, it can
|
|
be queried by a lower-privileged user who does not have full query privileges for the base tables.
|
|
</entry>
|
|
<entry>
|
|
ALL, SELECT
|
|
</entry>
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/alter_table_set_location_privs">
|
|
<entry/>
|
|
</row>
|
|
<row id="create_external_table_privs">
|
|
<entry>
|
|
CREATE EXTERNAL TABLE
|
|
</entry>
|
|
<entry>
|
|
Database (ALL), URI (SELECT)
|
|
</entry>
|
|
<entry>
|
|
ALL, SELECT
|
|
</entry>
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/select_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/use_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/create_function_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/drop_function_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/refresh_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/invalidate_metadata_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/invalidate_metadata_table_privs">
|
|
<entry/>
|
|
</row>
|
|
<row conref="../shared/impala_common.xml#common/compute_stats_privs">
|
|
<entry/>
|
|
</row>
|
|
<row id="show_table_stats_privs">
|
|
<entry>
|
|
SHOW TABLE STATS, SHOW PARTITIONS
|
|
</entry>
|
|
<entry>
|
|
TABLE
|
|
</entry>
|
|
<entry>
|
|
SELECT/INSERT
|
|
</entry>
|
|
<entry/>
|
|
</row>
|
|
<row>
|
|
<entry id="show_column_stats_privs">
|
|
SHOW COLUMN STATS
|
|
</entry>
|
|
<entry>
|
|
TABLE
|
|
</entry>
|
|
<entry>
|
|
SELECT/INSERT
|
|
</entry>
|
|
<entry/>
|
|
</row>
|
|
<row>
|
|
<entry id="show_functions_privs">
|
|
SHOW FUNCTIONS
|
|
</entry>
|
|
<entry>
|
|
DATABASE
|
|
</entry>
|
|
<entry>
|
|
SELECT
|
|
</entry>
|
|
<entry/>
|
|
</row>
|
|
<row id="show_tables_privs">
|
|
<entry>
|
|
SHOW TABLES
|
|
</entry>
|
|
<entry/>
|
|
<entry>
|
|
No special privileges needed to issue the statement, but only shows objects you are authorized for
|
|
</entry>
|
|
<entry/>
|
|
</row>
|
|
<row id="show_databases_privs">
|
|
<entry>
|
|
SHOW DATABASES, SHOW SCHEMAS
|
|
</entry>
|
|
<entry/>
|
|
<entry>
|
|
No special privileges needed to issue the statement, but only shows objects you are authorized for
|
|
</entry>
|
|
<entry/>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<!-- Original table, now replaced by content conref'ed from Security Guide
|
|
<table>
|
|
<title>Sentry privileges for Impala SQL operations</title>
|
|
<tgroup cols="3">
|
|
<colspec colnum="1" colname="col1" colwidth="*"/>
|
|
<colspec colnum="2" colname="col2" colwidth="*"/>
|
|
<colspec colnum="3" colname="col3" colwidth="4*"/>
|
|
<thead>
|
|
<row>
|
|
<entry>Impala SQL Operation</entry>
|
|
<entry>Privileges Required</entry>
|
|
<entry>Object on which Privileges Required</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>EXPLAIN</entry>
|
|
<entry>SELECT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>LOAD DATA</entry>
|
|
<entry>INSERT, SELECT</entry>
|
|
<entry>Table (INSERT) and URI (ALL); write privilege is required on the URI because the data files are physically moved from there</entry>
|
|
</row>
|
|
<row>
|
|
<entry>CREATE DATABASE</entry>
|
|
<entry>ALL</entry>
|
|
<entry><ph rev="1.4.0">Server</ph></entry>
|
|
</row>
|
|
<row>
|
|
<entry>DROP DATABASE</entry>
|
|
<entry>ALL</entry>
|
|
<entry><ph rev="1.4.0">Server</ph></entry>
|
|
</row>
|
|
<row>
|
|
<entry>CREATE TABLE</entry>
|
|
<entry>ALL</entry>
|
|
<entry><ph rev="1.4.0">Database</ph></entry>
|
|
</row>
|
|
<row>
|
|
<entry>DROP TABLE</entry>
|
|
<entry>ALL</entry>
|
|
<entry><ph rev="1.4.0">Database</ph></entry>
|
|
</row>
|
|
<row>
|
|
<entry>DESCRIBE TABLE</entry>
|
|
<entry>SELECT or INSERT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER TABLE</entry>
|
|
<entry>ALL</entry>
|
|
<entry><ph rev="1.4.0">Database</ph></entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW DATABASES</entry>
|
|
<entry>Any privilege</entry>
|
|
<entry>Any object in the database; only databases for which the user has privileges on some object are shown</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW TABLES</entry>
|
|
<entry>Any privilege</entry>
|
|
<entry>Any table in the database; only tables for which the user has privileges are shown</entry>
|
|
</row>
|
|
<row>
|
|
<entry>CREATE VIEW</entry>
|
|
<entry>ALL, SELECT</entry>
|
|
<entry rev="2.3.0 collevelauth">
|
|
You need <codeph>ALL</codeph> privilege on the named view <ph rev="1.4.0">and the parent database</ph>, plus <codeph>SELECT</codeph> privilege for any tables or views referenced by the view query.
|
|
Once the view is created or altered by a high-privileged system administrator,
|
|
it can be queried by a lower-privileged user who does not have full query privileges for the base tables.
|
|
</entry>
|
|
</row>
|
|
<row>
|
|
<entry>DROP VIEW</entry>
|
|
<entry>ALL</entry>
|
|
<entry><ph rev="1.4.0">Database</ph></entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER VIEW</entry>
|
|
<entry>ALL, SELECT</entry>
|
|
<entry rev="2.3.0 collevelauth">
|
|
You need <codeph>ALL</codeph> privilege on the named view <ph rev="1.4.0">and the parent database</ph>, plus <codeph>SELECT</codeph> privilege for any tables or views referenced by the view query.
|
|
Once the view is created or altered by a high-privileged system administrator,
|
|
it can be queried by a lower-privileged user who does not have full query privileges for the base tables.
|
|
</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER TABLE LOCATION</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Table, URI</entry>
|
|
</row>
|
|
<row>
|
|
<entry>CREATE EXTERNAL TABLE</entry>
|
|
<entry>ALL, SELECT</entry>
|
|
<entry>Database (ALL), URI (SELECT)</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SELECT</entry>
|
|
<entry>SELECT</entry>
|
|
<entry rev="2.3.0 collevelauth">Table, View; you can have SELECT privilege for a view without having SELECT privilege
|
|
for the underlying tables. See
|
|
<xref audience="integrated" href="sg_hive_sql.xml#concept_c2q_4qx_p4/col_level_auth_sentry"/><xref audience="standalone" href="https://www.cloudera.com/documentation/enterprise/latest/topics/sg_hive_sql.html" format="html" scope="external"/>
|
|
for details about column-level authorization.
|
|
Before column-level authorization was added in <keyword keyref="impala23_full"/> the way to implement
|
|
column-level security was by creating views that referenced particular sets of columns.</entry>
|
|
</row>
|
|
<row>
|
|
<entry>USE</entry>
|
|
<entry>Any privilege</entry>
|
|
<entry>Any object in the database</entry>
|
|
</row>
|
|
<row rev="1.2">
|
|
<entry>CREATE FUNCTION</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Server</entry>
|
|
</row>
|
|
<row rev="1.2">
|
|
<entry>DROP FUNCTION</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Server</entry>
|
|
</row>
|
|
<row>
|
|
<entry>REFRESH</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>INVALIDATE METADATA</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Server</entry>
|
|
</row>
|
|
<row rev="1.2.2">
|
|
<entry>COMPUTE STATS</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW TABLE STATS, SHOW PARTITIONS</entry>
|
|
<entry>?</entry>
|
|
<entry>?</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW COLUMN STATS</entry>
|
|
<entry>?</entry>
|
|
<entry>?</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW FUNCTIONS</entry>
|
|
<entry>?</entry>
|
|
<entry>?</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW DATABASES, SHOW SCHEMAS</entry>
|
|
<entry>?</entry>
|
|
<entry>?</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
-->
|
|
|
|
<!-- Here are pieces from the original Hive table that do not apply to Impala because of differences in the SQL syntax.
|
|
Might resurrect some of these later as features are added to Impala, in particular CREATE TABLE AS SELECT.
|
|
<row>
|
|
<entry>EXPORT</entry>
|
|
<entry>SELECT, INSERT@URI</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>IMPORT</entry>
|
|
<entry>ALL, SELECT@URI</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ANALYZE</entry>
|
|
<entry>INSERT, SELECT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SWITCH DATABASE</entry>
|
|
<entry>Any</entry>
|
|
<entry>Any Table, View in the DB</entry>
|
|
</row>
|
|
<row>
|
|
<entry>DESCRIBE FUNCTION*</entry>
|
|
<entry>Not Restricted</entry>
|
|
</row>
|
|
<row>
|
|
<entry>MSCK</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW COLUMNS</entry>
|
|
<entry>SELECT or INSERT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW TABLE STATUS</entry>
|
|
<entry>SELECT or INSERT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW TABLE PROPERTIES</entry>
|
|
<entry>SELECT or INSERT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW CREATE TABLE</entry>
|
|
<entry>SELECT or INSERT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW FUNCTIONS*</entry>
|
|
<entry>Not restricted</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW PARTITIONS</entry>
|
|
<entry>SELECT or INSERT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW INDEXES</entry>
|
|
<entry>SELECT or INSERT</entry>
|
|
<entry>Table</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW LOCKS</entry>
|
|
<entry>Not Restricted</entry>
|
|
</row>
|
|
<row>
|
|
<entry>CREATE FUNCTION</entry>
|
|
<entry>Any Privilege</entry>
|
|
<entry>Any Object</entry>
|
|
</row>
|
|
<row>
|
|
<entry>CREATE INDEX</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>DROP FUNCTION</entry>
|
|
<entry>Any Privilege</entry>
|
|
<entry>Any Object</entry>
|
|
</row>
|
|
<row>
|
|
<entry>DROP INDEX</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER INDEX REBUILD</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER VIEW PROPERTIES</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>GRANT PRIVILEGE</entry>
|
|
<entry>Allowed, but has no effect on HS2 auth</entry>
|
|
</row>
|
|
<row>
|
|
<entry>REVOKE PRIVILEGE</entry>
|
|
<entry>Allowed, but has no effect on HS2 auth</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHOW GRANTS</entry>
|
|
<entry>Allowed, but has no effect on HS2 auth</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER TABLE PROTECT MODE</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER TABLE FILE FORMAT</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER PARTITION PROTECT MODE</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER PARTITION FILE FORMAT</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER PARTITION LOCATION*</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Server</entry>
|
|
</row>
|
|
<row>
|
|
<entry>CREATE TABLE AS SELECT</entry>
|
|
<entry>ALL, SELECT</entry>
|
|
<entry>Database, Table/View</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER INDEX PROP</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER DATABASE</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>DESCRIBE DATABASE</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER TABLE MERGE FILE</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER PARTITION MERGE FILE</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER TABLE SKEWED</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ALTER TABLE PARTITION SKEWED LOCATION</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Database</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ADD JAR*</entry>
|
|
<entry>Restricted unless hive.server2.authorization.external.exec = true</entry>
|
|
</row>
|
|
<row>
|
|
<entry>TRANSFORM*</entry>
|
|
<entry>ALL</entry>
|
|
<entry>Server</entry>
|
|
</row>
|
|
-->
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="sentry_debug">
|
|
|
|
<title><ph conref="../shared/impala_common.xml#common/title_sentry_debug"/></title>
|
|
|
|
<conbody>
|
|
|
|
<p conref="../shared/impala_common.xml#common/sentry_debug"/>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="sentry_cm">
|
|
|
|
<title>Managing Sentry for Impala through Cloudera Manager</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
To enable the Sentry service for Impala and Hive, set <uicontrol>Hive/Impala > Service-Wide > Sentry
|
|
Service</uicontrol> parameter to the Sentry service. Then restart Impala and Hive. Simply adding Sentry
|
|
service as a dependency and restarting enables Impala and Hive to use the Sentry service.
|
|
</p>
|
|
|
|
<p>
|
|
To set the server name to use when granting server level privileges, set the <uicontrol>Hive >
|
|
Service-Wide > Advanced > Server Name for Sentry Authorization</uicontrol> parameter. When using
|
|
Sentry with the Hive Metastore, you can specify the list of users that are allowed to bypass Sentry
|
|
Authorization in Hive Metastore using <uicontrol>Hive > Service-Wide > Security > Bypass Sentry
|
|
Authorization Users</uicontrol>. These are usually service users that already ensure all activity has been
|
|
authorized.
|
|
</p>
|
|
|
|
<note>
|
|
The <uicontrol>Hive/Impala > Service-Wide > Policy File Based Sentry</uicontrol> tab contains
|
|
parameters only relevant to configuring Sentry using policy files. In particular, make sure that
|
|
<uicontrol>Enable Sentry Authorization using Policy Files</uicontrol> parameter is unchecked when using the
|
|
Sentry service. Cloudera Manager throws a validation error if you attempt to configure the Sentry service
|
|
and policy file at the same time.
|
|
</note>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="sec_ex_default">
|
|
|
|
<title>The DEFAULT Database in a Secure Deployment</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Because of the extra emphasis on granular access controls in a secure deployment, you should move any
|
|
important or sensitive information out of the <codeph>DEFAULT</codeph> database into a named database whose
|
|
privileges are specified in the policy file. Sometimes you might need to give privileges on the
|
|
<codeph>DEFAULT</codeph> database for administrative reasons; for example, as a place you can reliably
|
|
specify with a <codeph>USE</codeph> statement when preparing to drop a database.
|
|
</p>
|
|
|
|
<!-- Maybe have an example later, but not for initial 1.1 release.
|
|
<codeblock></codeblock>
|
|
-->
|
|
</conbody>
|
|
</concept>
|
|
</concept>
|