mirror of
https://github.com/apache/impala.git
synced 2026-02-03 09:00:39 -05:00
The patch also fixes broken links in the authorization doc. Change-Id: I9bf6109636e44ca514cfe74fb565f7c506ec0708 Reviewed-on: http://gerrit.cloudera.org:8080/10975 Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
534 lines
27 KiB
XML
534 lines
27 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!--
|
|
Licensed to the Apache Software Foundation (ASF) under one
|
|
or more contributor license agreements. See the NOTICE file
|
|
distributed with this work for additional information
|
|
regarding copyright ownership. The ASF licenses this file
|
|
to you under the Apache License, Version 2.0 (the
|
|
"License"); you may not use this file except in compliance
|
|
with the License. You may obtain a copy of the License at
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
Unless required by applicable law or agreed to in writing,
|
|
software distributed under the License is distributed on an
|
|
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
|
KIND, either express or implied. See the License for the
|
|
specific language governing permissions and limitations
|
|
under the License.
|
|
-->
|
|
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
|
<concept rev="1.1" id="authorization">
|
|
|
|
<title>Enabling Sentry Authorization for Impala</title>
|
|
<prolog>
|
|
<metadata>
|
|
<data name="Category" value="Security"/>
|
|
<data name="Category" value="Sentry"/>
|
|
<data name="Category" value="Impala"/>
|
|
<data name="Category" value="Configuring"/>
|
|
<data name="Category" value="Starting and Stopping"/>
|
|
<data name="Category" value="Users"/>
|
|
<data name="Category" value="Groups"/>
|
|
<data name="Category" value="Administrators"/>
|
|
</metadata>
|
|
</prolog>
|
|
|
|
<conbody id="sentry">
|
|
|
|
<p>
|
|
Authorization determines which users are allowed to access which resources, and what operations they are
|
|
allowed to perform. In Impala 1.1 and higher, you use Apache Sentry for
|
|
authorization. Sentry adds a fine-grained authorization framework for Hadoop. By default (when authorization
|
|
is not enabled), Impala does all read and write operations with the privileges of the <codeph>impala</codeph>
|
|
user, which is suitable for a development/test environment but not for a secure production environment. When
|
|
authorization is enabled, Impala uses the OS user ID of the user who runs <cmdname>impala-shell</cmdname> or
|
|
other client program, and associates various privileges with each user.
|
|
</p>
|
|
|
|
<note>
|
|
Sentry is typically used in conjunction with Kerberos authentication, which defines which hosts are allowed
|
|
to connect to each server. Using the combination of Sentry and Kerberos prevents malicious users from being
|
|
able to connect by creating a named account on an untrusted machine. See
|
|
<xref href="impala_kerberos.xml#kerberos"/> for details about Kerberos authentication.
|
|
</note>
|
|
|
|
<p audience="PDF" outputclass="toc inpage">
|
|
See the following sections for details about using the Impala authorization features:
|
|
</p>
|
|
</conbody>
|
|
|
|
<concept id="sentry_priv_model">
|
|
|
|
<title>The Sentry Privilege Model</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Privileges can be granted on different objects in the schema. Any privilege that can be
|
|
granted is associated with a level in the object hierarchy. If a privilege is granted on
|
|
a parent object in the hierarchy, the child object automatically inherits it. This is
|
|
the same privilege model as Hive and other database systems.
|
|
</p>
|
|
|
|
<p>
|
|
The objects in the Impala schema hierarchy are:
|
|
</p>
|
|
|
|
<codeblock>Server
|
|
URI
|
|
Database
|
|
Table
|
|
Column
|
|
</codeblock>
|
|
<p rev="2.3.0 collevelauth"> The table-level privileges apply to views as
|
|
well. Anywhere you specify a table name, you can specify a view name
|
|
instead. </p>
|
|
<p rev="2.3.0 collevelauth"> In <keyword keyref="impala23_full"/> and
|
|
higher, you can specify privileges for individual columns. </p>
|
|
|
|
<p conref="../shared/impala_common.xml#common/sentry_privileges_objects"/>
|
|
<p> Originally, privileges were encoded in a policy file, stored in HDFS.
|
|
This mode of operation is still an option, but the emphasis of privilege
|
|
management is moving towards being SQL-based. The mode of operation with
|
|
<codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements instead
|
|
of the policy file requires that a special Sentry service be enabled;
|
|
this service stores, retrieves, and manipulates privilege information
|
|
stored inside the metastore database. </p>
|
|
|
|
<note>
|
|
<p>
|
|
Although this document refers to the <codeph>ALL</codeph> privilege, currently if you
|
|
use the policy file mode, you do not use the actual keyword <codeph>ALL</codeph> in
|
|
the policy file. When you code role entries in the policy file:
|
|
</p>
|
|
<ul>
|
|
<li>
|
|
To specify the <codeph>ALL</codeph> privilege for a server, use a role like
|
|
<codeph>server=<varname>server_name</varname></codeph>.
|
|
</li>
|
|
|
|
<li>
|
|
To specify the <codeph>ALL</codeph> privilege for a database, use a role like
|
|
<codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname></codeph>.
|
|
</li>
|
|
|
|
<li>
|
|
To specify the <codeph>ALL</codeph> privilege for a table, use a role like
|
|
<codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=*</codeph>.
|
|
</li>
|
|
</ul>
|
|
</note>
|
|
<p> If you change privileges in Sentry, e.g. adding a user, removing a
|
|
user, modifying privileges, you must clear the Impala Catalog server
|
|
cache by running the <codeph>INVALIDATE METADATA</codeph> statement.
|
|
<codeph>INVALIDATE METADATA</codeph> is not required if you make the
|
|
changes to privileges within Impala. </p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="secure_startup">
|
|
|
|
<title>Starting the impalad Daemon with Sentry Authorization Enabled</title>
|
|
<prolog>
|
|
<metadata>
|
|
<data name="Category" value="Starting and Stopping"/>
|
|
</metadata>
|
|
</prolog>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
To run the <cmdname>impalad</cmdname> daemon with authorization enabled, you add one or more options to the
|
|
<codeph>IMPALA_SERVER_ARGS</codeph> declaration in the <filepath>/etc/default/impala</filepath>
|
|
configuration file:
|
|
</p>
|
|
<ul>
|
|
<li>
|
|
<codeph>-server_name</codeph>: Turns on Sentry authorization for
|
|
Impala. The authorization rules refer to a symbolic server name, and
|
|
you specify the same name to use as the argument to the
|
|
<codeph>-server_name</codeph> option for all
|
|
<cmdname>impalad</cmdname> nodes in the cluster. <p> Starting in
|
|
Impala 1.4.0 and higher, if you specify just
|
|
<codeph>-server_name</codeph> without
|
|
<codeph>-authorization_policy_file</codeph>, Impala uses the
|
|
Sentry service for authorization. </p>
|
|
</li>
|
|
<li>
|
|
<codeph>-sentry_config</codeph>: Specifies the local path to the
|
|
<codeph>sentry-site.xml</codeph> configuration file. This setting is
|
|
required to enable authorization. </li>
|
|
<li>
|
|
<codeph>-authorization_policy_file</codeph>: Specifies the HDFS path
|
|
to the policy file that defines the privileges on schema objects.
|
|
Prior to Impala 1.4.0, or if you want to continue storing privilege
|
|
rules in the policy file, specify the
|
|
<codeph>-authorization_policy_file</codeph> option to make Impala
|
|
read privilege information from a policy file, rather than from the
|
|
metastore database. </li>
|
|
</ul>
|
|
|
|
<p rev="1.4.0">
|
|
For example, you might adapt your <filepath>/etc/default/impala</filepath> configuration to contain lines
|
|
like the following. To use the Sentry service rather than the policy file:
|
|
</p>
|
|
|
|
<codeblock rev="1.4.0">IMPALA_SERVER_ARGS=" \
|
|
-server_name=server1 \
|
|
...
|
|
</codeblock>
|
|
|
|
<p>
|
|
Or to use the policy file, as in releases prior to Impala 1.4:
|
|
</p>
|
|
|
|
<codeblock>IMPALA_SERVER_ARGS=" \
|
|
-authorization_policy_file=/user/hive/warehouse/auth-policy.ini \
|
|
-server_name=server1 \
|
|
...
|
|
</codeblock>
|
|
|
|
<p>
|
|
The preceding examples set up a symbolic name of <codeph>server1</codeph> to refer to
|
|
the current instance of Impala. Specify the symbolic name for the
|
|
<codeph>sentry.hive.server</codeph> property in the <filepath>sentry-site.xml</filepath>
|
|
configuration file for Hive, as well as in the <codeph>-server_name</codeph> option for
|
|
<cmdname>impalad</cmdname>.
|
|
</p>
|
|
<p> Now restart the <cmdname>impalad</cmdname> daemons on all the nodes. </p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="sentry_service">
|
|
|
|
<title>Using Impala with the Sentry Service</title>
|
|
<conbody>
|
|
<p> When you use the Sentry service, you set up privileges through the
|
|
<codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements in
|
|
either Impala or Hive. Then both components use those same privileges
|
|
automatically. (Impala added the <codeph>GRANT</codeph> and
|
|
<codeph>REVOKE</codeph> statements in <keyword keyref="impala20_full"
|
|
/>.) </p>
|
|
<p> For information about using the Impala <codeph>GRANT</codeph> and
|
|
<codeph>REVOKE</codeph> statements, see <xref
|
|
href="impala_grant.xml#grant"/> and <xref
|
|
href="impala_revoke.xml#revoke"/>. </p>
|
|
<p> URIs represent the file paths you specify as part of statements such
|
|
as <codeph>CREATE EXTERNAL TABLE</codeph> and <codeph>LOAD
|
|
DATA</codeph>. Typically, you specify what look like UNIX paths, but
|
|
these locations can also be prefixed with <codeph>hdfs://</codeph> to
|
|
make clear that they are really URIs. To set privileges for a URI,
|
|
specify the name of a directory, and the privilege applies to all the
|
|
files in that directory and any directories underneath it. </p>
|
|
<p> URIs must start with <codeph>hdfs://</codeph>,
|
|
<codeph>s3a://</codeph>, <codeph>adl://</codeph>, or
|
|
<codeph>file://</codeph>. If a URI starts with an absolute path, the
|
|
path will be appended to the default filesystem prefix. For example, if
|
|
you specify: <codeblock>
|
|
GRANT ALL ON URI '/tmp';
|
|
</codeblock> The above
|
|
statement effectively becomes the following where the default filesystem
|
|
is HDFS.
|
|
<codeblock>
|
|
GRANT ALL ON URI 'hdfs://localhost:20500/tmp';
|
|
</codeblock>
|
|
</p>
|
|
<p> When defining URIs for HDFS, you must also specify the NameNode. For
|
|
example: <codeblock>GRANT ALL ON URI file:///path/to/dir TO <role>
|
|
GRANT ALL ON URI hdfs://namenode:port/path/to/dir TO <role></codeblock>
|
|
<note type="warning">
|
|
<p> Because the NameNode host and port must be specified, it is
|
|
strongly recommended that you use High Availability (HA). This
|
|
ensures that the URI will remain constant even if the NameNode
|
|
changes. For example: </p>
|
|
<codeblock>GRANT ALL ON URI hdfs://ha-nn-uri/path/to/dir TO <role></codeblock>
|
|
</note>
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
<concept id="concept_k45_lbm_f2b">
|
|
<title>Examples of Setting up Authorization for Security Scenarios</title>
|
|
<conbody>
|
|
<p> The following examples show how to set up authorization to deal with
|
|
various scenarios. </p>
|
|
<example>
|
|
<title>A User with No Privileges</title>
|
|
<p> If a user has no privileges at all, that user cannot access any
|
|
schema objects in the system. The error messages do not disclose the
|
|
names or existence of objects that the user is not authorized to read. </p>
|
|
<p> This is the experience you want a user to have if they somehow log
|
|
into a system where they are not an authorized Impala user. Or in a
|
|
real deployment, a user might have no privileges because they are not
|
|
a member of any of the authorized groups. </p>
|
|
</example>
|
|
<example>
|
|
<title>Examples of Privileges for Administrative Users</title>
|
|
<p> In this example, the SQL statements grant the
|
|
<codeph>entire_server</codeph> role all privileges on both the
|
|
databases and URIs within the server. </p>
|
|
<codeblock>CREATE ROLE entire_server;
|
|
GRANT ROLE entire_server TO GROUP admin_group;
|
|
GRANT ALL ON SERVER server1 TO ROLE entire_server;
|
|
</codeblock>
|
|
</example>
|
|
<example>
|
|
<title>A User with Privileges for Specific Databases and Tables</title>
|
|
<p> If a user has privileges for specific tables in specific databases,
|
|
the user can access those things but nothing else. They can see the
|
|
tables and their parent databases in the output of <codeph>SHOW
|
|
TABLES</codeph> and <codeph>SHOW DATABASES</codeph>,
|
|
<codeph>USE</codeph> the appropriate databases, and perform the
|
|
relevant actions (<codeph>SELECT</codeph> and/or
|
|
<codeph>INSERT</codeph>) based on the table privileges. To actually
|
|
create a table requires the <codeph>ALL</codeph> privilege at the
|
|
database level, so you might define separate roles for the user that
|
|
sets up a schema and other users or applications that perform
|
|
day-to-day operations on the tables. </p>
|
|
<codeblock>
|
|
CREATE ROLE one_database;
|
|
GRANT ROLE one_database TO GROUP admin_group;
|
|
GRANT ALL ON DATABASE db1 TO ROLE one_database;
|
|
|
|
CREATE ROLE instructor;
|
|
GRANT ROLE instructor TO GROUP trainers;
|
|
GRANT ALL ON TABLE db1.lesson TO ROLE instructor;
|
|
|
|
# This particular course is all about queries, so the students can SELECT but not INSERT or CREATE/DROP.
|
|
CREATE ROLE student;
|
|
GRANT ROLE student TO GROUP visitors;
|
|
GRANT SELECT ON TABLE db1.training TO ROLE student;</codeblock>
|
|
</example>
|
|
<example>
|
|
<title>Privileges for Working with External Data Files</title>
|
|
<p> When data is being inserted through the <codeph>LOAD DATA</codeph>
|
|
statement, or is referenced from an HDFS location outside the normal
|
|
Impala database directories, the user also needs appropriate
|
|
permissions on the URIs corresponding to those HDFS locations. </p>
|
|
<p> In this example: </p>
|
|
<ul>
|
|
<li> The <codeph>external_table</codeph> role can insert into and
|
|
query the Impala table, <codeph>external_table.sample</codeph>. </li>
|
|
<li> The <codeph>staging_dir</codeph> role can specify the HDFS path
|
|
<filepath>/user/impala-user/external_data</filepath> with the
|
|
<codeph>LOAD DATA</codeph> statement. When Impala queries or loads
|
|
data files, it operates on all the files in that directory, not just
|
|
a single file, so any Impala <codeph>LOCATION</codeph> parameters
|
|
refer to a directory rather than an individual file. </li>
|
|
</ul>
|
|
<codeblock>CREATE ROLE external_table;
|
|
GRANT ROLE external_table TO GROUP impala_users;
|
|
GRANT ALL ON TABLE external_table.sample TO ROLE external_table;
|
|
|
|
CREATE ROLE staging_dir;
|
|
GRANT ROLE staging TO GROUP impala_users;
|
|
GRANT ALL ON URI 'hdfs://127.0.0.1:8020/user/impala-user/external_data' TO ROLE staging_dir;</codeblock>
|
|
</example>
|
|
<example>
|
|
<title>Separating Administrator Responsibility from Read and Write
|
|
Privileges</title>
|
|
<p> To create a database, you need the full privilege on that database
|
|
while day-to-day operations on tables within that database can be
|
|
performed with lower levels of privilege on specific table. Thus, you
|
|
might set up separate roles for each database or application: an
|
|
administrative one that could create or drop the database, and a
|
|
user-level one that can access only the relevant tables. </p>
|
|
<p> In this example, the responsibilities are divided between users in 3
|
|
different groups: </p>
|
|
<ul>
|
|
<li> Members of the <codeph>supergroup</codeph> group have the
|
|
<codeph>training_sysadmin</codeph> role and so can set up a
|
|
database named <codeph>training</codeph>. </li>
|
|
<li> Members of the <codeph>impala_users</codeph> group have the
|
|
<codeph>instructor</codeph> role and so can create, insert into,
|
|
and query any tables in the <codeph>training</codeph> database, but
|
|
cannot create or drop the database itself. </li>
|
|
<li> Members of the <codeph>visitor</codeph> group have the
|
|
<codeph>student</codeph> role and so can query those tables in the
|
|
<codeph>training</codeph> database. </li>
|
|
</ul>
|
|
<codeblock>CREATE ROLE training_sysadmin;
|
|
GRANT ROLE training_sysadmin TO GROUP supergroup;
|
|
GRANT ALL ON DATABASE training1 TO ROLE training_sysadmin;
|
|
|
|
CREATE ROLE instructor;
|
|
GRANT ROLE instructor TO GROUP impala_users;
|
|
GRANT ALL ON TABLE training1.course1 TO ROLE instructor;
|
|
|
|
CREATE ROLE visitor;
|
|
GRANT ROLE student TO GROUP visitor;
|
|
GRANT SELECT ON TABLE training1.course1 TO ROLE student;</codeblock>
|
|
</example>
|
|
</conbody>
|
|
</concept>
|
|
<concept id="security_policy_file">
|
|
<title>Using Impala with the Sentry Policy File</title>
|
|
<conbody>
|
|
<p> The policy file is a file that you put in a designated location in
|
|
HDFS, and is read during the startup of the <cmdname>impalad</cmdname>
|
|
daemon when you specify both the <codeph>-server_name</codeph> and
|
|
<codeph>-authorization_policy_file</codeph> startup options. It
|
|
controls which objects (databases, tables, and HDFS directory paths) can
|
|
be accessed by the user who connects to <cmdname>impalad</cmdname>, and
|
|
what operations that user can perform on the objects. </p>
|
|
<note rev="1.4.0"> In <ph rev="upstream">CDH 5</ph> and higher, <ph
|
|
rev="upstream">Cloudera</ph> recommends managing privileges through
|
|
SQL statements, as described in <xref
|
|
href="impala_authorization.xml#sentry_service"/>. If you are still
|
|
using policy files, plan to migrate to the new approach some time in the
|
|
future. </note>
|
|
<p> The location of the policy file is listed in the
|
|
<filepath>auth-site.xml</filepath> configuration file. </p>
|
|
<p> When authorization is enabled, Impala uses the policy file as a
|
|
<i>whitelist</i>, representing every privilege available to any user
|
|
on any object. That is, only operations specified for the appropriate
|
|
combination of object, role, group, and user are allowed. All other
|
|
operations are not allowed. If a group or role is defined multiple times
|
|
in the policy file, the last definition takes precedence. </p>
|
|
<p> To understand the notion of whitelisting, set up a minimal policy file
|
|
that does not provide any privileges for any object. When you connect to
|
|
an Impala node where this policy file is in effect, you get no results
|
|
for <codeph>SHOW DATABASES</codeph>, and an error when you issue any
|
|
<codeph>SHOW TABLES</codeph>, <codeph>USE
|
|
<varname>database_name</varname></codeph>, <codeph>DESCRIBE
|
|
<varname>table_name</varname></codeph>, <codeph>SELECT</codeph>, and
|
|
or other statements that expect to access databases or tables, even if
|
|
the corresponding databases and tables exist. </p>
|
|
<p> The contents of the policy file are cached, to avoid a performance
|
|
penalty for each query. The policy file is re-checked by each
|
|
<cmdname>impalad</cmdname> node every 5 minutes. When you make a
|
|
non-time-sensitive change such as adding new privileges or new users,
|
|
you can let the change take effect automatically a few minutes later. If
|
|
you remove or reduce privileges, and want the change to take effect
|
|
immediately, restart the <cmdname>impalad</cmdname> daemon on all nodes,
|
|
again specifying the <codeph>-server_name</codeph> and
|
|
<codeph>-authorization_policy_file</codeph> options so that the rules
|
|
from the updated policy file are applied. </p>
|
|
</conbody>
|
|
<concept id="security_policy_file_details">
|
|
<title>Policy File Format</title>
|
|
<conbody>
|
|
<p> The policy file uses the familiar <codeph>.ini</codeph> format,
|
|
divided into the major sections <codeph>[groups]</codeph> and
|
|
<codeph>[roles]</codeph>. </p>
|
|
<p> There is also an optional <codeph>[databases]</codeph> section,
|
|
which allows you to specify a specific policy file for a particular
|
|
database, as explained in <xref href="#security_multiple_policy_files"
|
|
/>. </p>
|
|
<p> Another optional section, <codeph>[users]</codeph>, allows you to
|
|
override the OS-level mapping of users to groups; that is an advanced
|
|
technique primarily for testing and debugging, and is beyond the scope
|
|
of this document. </p>
|
|
<p> In the <codeph>[groups]</codeph> section, you define various
|
|
categories of users and select which roles are associated with each
|
|
category. The group and usernames correspond to Linux groups and users
|
|
on the server where the <cmdname>impalad</cmdname> daemon runs. </p>
|
|
<p> The group and usernames in the <codeph>[groups]</codeph> section
|
|
correspond to Hadoop groups and users on the server where the
|
|
<cmdname>impalad</cmdname> daemon runs. When you access Impala
|
|
through the <cmdname>impalad</cmdname> interpreter, for purposes of
|
|
authorization, the user is the logged-in Linux user and the groups are
|
|
the Linux groups that user is a member of. When you access Impala
|
|
through the ODBC or JDBC interfaces, the user and password specified
|
|
through the connection string are used as login credentials for the
|
|
Linux server, and authorization is based on that username and the
|
|
associated Linux group membership. </p>
|
|
<p> In the <codeph>[roles]</codeph> section, you a set of roles. For
|
|
each role, you specify precisely the set of privileges is available.
|
|
That is, which objects users with that role can access, and what
|
|
operations they can perform on those objects. This is the lowest-level
|
|
category of security information; the other sections in the policy
|
|
file map the privileges to higher-level divisions of groups and users.
|
|
In the <codeph>[groups]</codeph> section, you specify which roles are
|
|
associated with which groups. The group and usernames correspond to
|
|
Linux groups and users on the server where the
|
|
<cmdname>impalad</cmdname> daemon runs. The privileges are specified
|
|
using patterns like:
|
|
<codeblock>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=SELECT
|
|
server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=t<varname>able_name</varname>->action=CREATE
|
|
server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=ALL
|
|
</codeblock>
|
|
For the <varname>server_name</varname> value, substitute the same
|
|
symbolic name you specify with the <cmdname>impalad</cmdname>
|
|
<codeph>-server_name</codeph> option. You can use <codeph>*</codeph>
|
|
wildcard characters at each level of the privilege specification to
|
|
allow access to all such objects. For example:
|
|
<codeblock>server=impala-host.example.com->db=default->table=t1->action=SELECT
|
|
server=impala-host.example.com->db=*->table=*->action=CREATE
|
|
server=impala-host.example.com->db=*->table=audit_log->action=SELECT
|
|
server=impala-host.example.com->db=default->table=t1->action=*
|
|
</codeblock>
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
<concept id="security_multiple_policy_files">
|
|
<title>Using Multiple Policy Files for Different Databases</title>
|
|
<conbody>
|
|
<p> For an Impala cluster with many databases being accessed by many
|
|
users and applications, it might be cumbersome to update the security
|
|
policy file for each privilege change or each new database, table, or
|
|
view. You can allow security to be managed separately for individual
|
|
databases, by setting up a separate policy file for each database: </p>
|
|
<ul>
|
|
<li> Add the optional <codeph>[databases]</codeph> section to the main
|
|
policy file. </li>
|
|
<li> Add entries in the <codeph>[databases]</codeph> section for each
|
|
database that has its own policy file. </li>
|
|
<li> For each listed database, specify the HDFS path of the
|
|
appropriate policy file. </li>
|
|
</ul>
|
|
<p> For example: </p>
|
|
<codeblock>[databases]
|
|
# Defines the location of the per-DB policy files for the 'customers' and 'sales' databases.
|
|
customers = hdfs://ha-nn-uri/etc/access/customers.ini
|
|
sales = hdfs://ha-nn-uri/etc/access/sales.ini
|
|
</codeblock>
|
|
<p> To enable URIs in per-DB policy files, the Java configuration option
|
|
<codeph>sentry.allow.uri.db.policyfile</codeph> must be set to
|
|
<codeph>true</codeph>. For example: </p>
|
|
<codeblock>JAVA_TOOL_OPTIONS="-Dsentry.allow.uri.db.policyfile=true"
|
|
</codeblock>
|
|
<note type="important"> Enabling URIs in per-DB policy files introduces
|
|
a security risk by allowing the owner of the db-level policy file to
|
|
grant himself/herself load privileges to anything the
|
|
<codeph>impala</codeph> user has read permissions for in HDFS
|
|
(including data in other databases controlled by different db-level
|
|
policy files). </note>
|
|
</conbody>
|
|
</concept>
|
|
</concept>
|
|
<concept id="security_schema">
|
|
<title>Setting Up Schema Objects for a Secure Impala Deployment</title>
|
|
<conbody>
|
|
<p> In your role definitions, you must specify privileges at the level of
|
|
individual databases and tables, or all databases or all tables within a
|
|
database. To simplify the structure of these rules, plan ahead of time
|
|
how to name your schema objects so that data with different
|
|
authorization requirements is divided into separate databases. </p>
|
|
<p> If you are adding security on top of an existing Impala deployment,
|
|
you can rename tables or even move them between databases using the
|
|
<codeph>ALTER TABLE</codeph> statement. </p>
|
|
</conbody>
|
|
</concept>
|
|
<concept id="sentry_debug">
|
|
<title><ph conref="../shared/impala_common.xml#common/title_sentry_debug"
|
|
/></title>
|
|
<conbody>
|
|
<p conref="../shared/impala_common.xml#common/sentry_debug"/>
|
|
</conbody>
|
|
</concept>
|
|
<concept id="sec_ex_default">
|
|
<title>The DEFAULT Database in a Secure Deployment</title>
|
|
<conbody>
|
|
<p> Because of the extra emphasis on granular access controls in a secure
|
|
deployment, you should move any important or sensitive information out
|
|
of the <codeph>DEFAULT</codeph> database into a named database whose
|
|
privileges are specified in the policy file. Sometimes you might need to
|
|
give privileges on the <codeph>DEFAULT</codeph> database for
|
|
administrative reasons; for example, as a place you can reliably specify
|
|
with a <codeph>USE</codeph> statement when preparing to drop a database.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
</concept>
|