mirror of
https://github.com/apache/impala.git
synced 2026-01-26 03:01:30 -05:00
Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Reviewed-on: http://gerrit.cloudera.org:8080/11386 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Adam Holley <aholley@cloudera.com> Reviewed-by: Alex Rodoni <arodoni@cloudera.com>
713 lines
28 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept rev="1.1" id="authorization">

<title>Enabling Sentry Authorization for Impala</title>

<prolog>
<metadata>
<data name="Category" value="Security"/>
<data name="Category" value="Sentry"/>
<data name="Category" value="Impala"/>
<data name="Category" value="Configuring"/>
<data name="Category" value="Starting and Stopping"/>
<data name="Category" value="Users"/>
<data name="Category" value="Groups"/>
<data name="Category" value="Administrators"/>
</metadata>
</prolog>

<conbody id="sentry">

<p>
Authorization determines which users are allowed to access which resources, and what
operations they are allowed to perform. In Impala 1.1 and higher, you use Apache Sentry
for authorization. Sentry adds a fine-grained authorization framework for Hadoop. By
default (when authorization is not enabled), Impala does all read and write operations
with the privileges of the <codeph>impala</codeph> user, which is suitable for a
development/test environment but not for a secure production environment. When
authorization is enabled, Impala uses the OS user ID of the user who runs
<cmdname>impala-shell</cmdname> or another client program, and associates various
privileges with each user.
</p>

<note>
Sentry is typically used in conjunction with Kerberos authentication, which defines which
hosts are allowed to connect to each server. Using the combination of Sentry and Kerberos
prevents malicious users from being able to connect by creating a named account on an
untrusted machine. See <xref href="impala_kerberos.xml#kerberos"/> for details about
Kerberos authentication.
</note>

<p audience="PDF" outputclass="toc inpage">
See the following sections for details about using the Impala authorization features:
</p>

</conbody>

<concept id="sentry_priv_model">

<title>The Sentry Privilege Model</title>

<conbody>

<p>
Privileges can be granted on different objects in the schema. Any privilege that can be
granted is associated with a level in the object hierarchy. If a privilege is granted on
a parent object in the hierarchy, the child object automatically inherits it. This is
the same privilege model as Hive and other database systems.
</p>

<p>
The objects in the Impala schema hierarchy are:
</p>

<codeblock>Server
URI
Database
Table
Column
</codeblock>
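
<p>
As a minimal sketch of how inheritance works in this hierarchy, the following statements
grant a privilege at the database level so that it also applies to the objects inside
that database. The role name <codeph>analyst</codeph>, the group name
<codeph>analysts</codeph>, and the database name <codeph>sales_db</codeph> are
hypothetical, and the statements assume that the Sentry service is in use:
</p>

<codeblock>-- Hypothetical names: analyst (role), analysts (group), sales_db (database).
CREATE ROLE analyst;
GRANT ROLE analyst TO GROUP analysts;

-- A privilege granted on the parent object (the database) is inherited
-- by the child objects inside it.
GRANT SELECT ON DATABASE sales_db TO ROLE analyst;
</codeblock>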

<p rev="2.3.0 collevelauth">
The table-level privileges apply to views as well. Anywhere you specify a table name,
you can specify a view name instead.
</p>

<p rev="2.3.0 collevelauth">
In <keyword keyref="impala23_full"/> and higher, you can specify privileges for
individual columns.
</p>
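
<p>
For illustration, a column-level grant might look like the following sketch. The table
<codeph>db1.customers</codeph>, its column names, and the role
<codeph>support_staff</codeph> are hypothetical, and the role is assumed to exist
already:
</p>

<codeblock>-- Hypothetical table, columns, and role.
-- Only the listed columns are covered by this grant.
GRANT SELECT(customer_id, region) ON TABLE db1.customers TO ROLE support_staff;
</codeblock>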

<p conref="../shared/impala_common.xml#common/sentry_privileges_objects"/>

<p>
Originally, privileges were encoded in a policy file, stored in HDFS. This mode of
operation is still an option, but the emphasis of privilege management is moving towards
being SQL-based. The mode of operation with <codeph>GRANT</codeph> and
<codeph>REVOKE</codeph> statements instead of the policy file requires that a special
Sentry service be enabled; this service stores, retrieves, and manipulates privilege
information stored inside the metastore database.
</p>

<note>
<p>
Although this document refers to the <codeph>ALL</codeph> privilege, currently if you
use the policy file mode, you do not use the actual keyword <codeph>ALL</codeph> in
the policy file. When you code role entries in the policy file:
</p>
<ul>
<li>
To specify the <codeph>ALL</codeph> privilege for a server, use a role like
<codeph>server=<varname>server_name</varname></codeph>.
</li>

<li>
To specify the <codeph>ALL</codeph> privilege for a database, use a role like
<codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname></codeph>.
</li>

<li>
To specify the <codeph>ALL</codeph> privilege for a table, use a role like
<codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=*</codeph>.
</li>
</ul>
</note>

<p>
If you change privileges in Sentry, for example by adding a user, removing a user, or
modifying privileges, you must clear the Impala Catalog server cache by running the
<codeph>INVALIDATE METADATA</codeph> statement. <codeph>INVALIDATE METADATA</codeph> is
not required if you make the privilege changes from within Impala.
</p>
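
<p>
For example, after an administrator changes privileges directly in Sentry, a session
like the following sketch clears the cached metadata and then checks the result. The
role name <codeph>analyst</codeph> is hypothetical:
</p>

<codeblock>-- Pick up privilege changes that were made outside Impala.
INVALIDATE METADATA;

-- Check what Impala now sees. The role name is a placeholder.
SHOW ROLES;
SHOW GRANT ROLE analyst;
</codeblock>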

</conbody>

</concept>

<concept id="secure_startup">

<title>Starting the impalad Daemon with Sentry Authorization Enabled</title>

<prolog>
<metadata>
<data name="Category" value="Starting and Stopping"/>
</metadata>
</prolog>

<conbody>

<p>
To run the <cmdname>impalad</cmdname> daemon with authorization enabled, you add one or
more options to the <codeph>IMPALA_SERVER_ARGS</codeph> declaration in the
<filepath>/etc/default/impala</filepath> configuration file:
</p>

<ul>
<li>
<codeph>-server_name</codeph>: Turns on Sentry authorization for Impala. The
authorization rules refer to a symbolic server name, and you specify the same name as
the argument to the <codeph>-server_name</codeph> option for all
<cmdname>impalad</cmdname> nodes in the cluster.
<p>
In Impala 1.4.0 and higher, if you specify just
<codeph>-server_name</codeph> without <codeph>-authorization_policy_file</codeph>,
Impala uses the Sentry service for authorization.
</p>
</li>

<li>
<codeph>-sentry_config</codeph>: Specifies the local path to the
<codeph>sentry-site.xml</codeph> configuration file. This setting is required to
enable authorization.
</li>

<li>
<codeph>-authorization_policy_file</codeph>: Specifies the HDFS path to the policy
file that defines the privileges on schema objects. Prior to Impala 1.4.0, or if you
want to continue storing privilege rules in the policy file, specify the
<codeph>-authorization_policy_file</codeph> option to make Impala read privilege
information from a policy file, rather than from the metastore database.
</li>
</ul>

<p rev="1.4.0">
For example, you might adapt your <filepath>/etc/default/impala</filepath> configuration
to contain lines like the following. To use the Sentry service rather than the policy
file:
</p>

<codeblock rev="1.4.0">IMPALA_SERVER_ARGS=" \
-server_name=server1 \
...
</codeblock>

<p>
Or to use the policy file, as in releases prior to Impala 1.4:
</p>

<codeblock>IMPALA_SERVER_ARGS=" \
-authorization_policy_file=/user/hive/warehouse/auth-policy.ini \
-server_name=server1 \
...
</codeblock>

<p>
The preceding examples set up a symbolic name of <codeph>server1</codeph> to refer to
the current instance of Impala. Specify the symbolic name for the
<codeph>sentry.hive.server</codeph> property in the <filepath>sentry-site.xml</filepath>
configuration file for Hive, as well as in the <codeph>-server_name</codeph> option for
<cmdname>impalad</cmdname>.
</p>

<p>
Now restart the <cmdname>impalad</cmdname> daemons on all the nodes.
</p>

</conbody>

</concept>

<concept id="sentry_service">

<title>Using Impala with the Sentry Service</title>

<conbody>

<p>
When you use the Sentry service, you set up privileges through the
<codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements in either Impala or Hive.
Then both components use those same privileges automatically. (Impala added the
<codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements in
<keyword keyref="impala20_full"/>.)
</p>

<p>
For information about using the Impala <codeph>GRANT</codeph> and
<codeph>REVOKE</codeph> statements, see <xref href="impala_grant.xml#grant"/>
and <xref href="impala_revoke.xml#revoke"/>.
</p>
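
<p>
As a brief sketch of the round trip, the following statements grant a privilege and
later withdraw it. The role <codeph>report_writer</codeph>, the group
<codeph>reporting</codeph>, and the table <codeph>db1.daily_totals</codeph> are
hypothetical names:
</p>

<codeblock>-- Hypothetical role, group, and table names.
CREATE ROLE report_writer;
GRANT ROLE report_writer TO GROUP reporting;
GRANT INSERT ON TABLE db1.daily_totals TO ROLE report_writer;

-- Withdraw the privilege, detach the role from the group, and remove the role.
REVOKE INSERT ON TABLE db1.daily_totals FROM ROLE report_writer;
REVOKE ROLE report_writer FROM GROUP reporting;
DROP ROLE report_writer;
</codeblock>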

<p>
URIs represent the file paths you specify as part of statements such as <codeph>CREATE
EXTERNAL TABLE</codeph> and <codeph>LOAD DATA</codeph>. Typically, you specify what look
like UNIX paths, but these locations can also be prefixed with <codeph>hdfs://</codeph>
to make clear that they are really URIs. To set privileges for a URI, specify the name
of a directory, and the privilege applies to all the files in that directory and any
directories underneath it.
</p>

<p>
URIs must start with <codeph>hdfs://</codeph>, <codeph>s3a://</codeph>,
<codeph>adl://</codeph>, or <codeph>file://</codeph>. If a URI starts with an absolute
path, the path is appended to the default filesystem prefix. For example, if you
specify:
<codeblock>
GRANT ALL ON URI '/tmp';
</codeblock>
The above statement effectively becomes the following where the default filesystem is
HDFS:
<codeblock>
GRANT ALL ON URI 'hdfs://localhost:20500/tmp';
</codeblock>
</p>

<p>
When defining URIs for HDFS, you must also specify the NameNode. For example:
<codeblock>GRANT ALL ON URI file:///path/to/dir TO &lt;role&gt;
GRANT ALL ON URI hdfs://namenode:port/path/to/dir TO &lt;role&gt;</codeblock>
<note type="warning">
<p>
Because the NameNode host and port must be specified, it is strongly recommended
that you use High Availability (HA). This ensures that the URI will remain constant
even if the NameNode changes. For example:
</p>
<codeblock>GRANT ALL ON URI hdfs://ha-nn-uri/path/to/dir TO &lt;role&gt;</codeblock>
</note>
</p>

</conbody>

</concept>

<concept id="concept_k45_lbm_f2b">

<title>Examples of Setting up Authorization for Security Scenarios</title>

<conbody>

<p>
The following examples show how to set up authorization to deal with various scenarios.
</p>

<example>

<title>A User with No Privileges</title>

<p>
If a user has no privileges at all, that user cannot access any schema objects in the
system. The error messages do not disclose the names or existence of objects that the
user is not authorized to read.
</p>

<p>
This is the experience you want a user to have if they somehow log into a system where
they are not an authorized Impala user. Or in a real deployment, a user might have no
privileges because they are not a member of any of the authorized groups.
</p>

</example>

<example>

<title>Examples of Privileges for Administrative Users</title>

<p>
In this example, the SQL statements grant the <codeph>entire_server</codeph> role all
privileges on both the databases and URIs within the server.
</p>

<codeblock>CREATE ROLE entire_server;
GRANT ROLE entire_server TO GROUP admin_group;
GRANT ALL ON SERVER server1 TO ROLE entire_server;
</codeblock>

</example>

<example>

<title>A User with Privileges for Specific Databases and Tables</title>

<p>
If a user has privileges for specific tables in specific databases, the user can
access those objects but nothing else. They can see the tables and their parent
databases in the output of <codeph>SHOW TABLES</codeph> and <codeph>SHOW
DATABASES</codeph>, <codeph>USE</codeph> the appropriate databases, and perform the
relevant actions (<codeph>SELECT</codeph> and/or <codeph>INSERT</codeph>) based on the
table privileges. To actually create a table requires the <codeph>ALL</codeph>
privilege at the database level, so you might define separate roles for the user that
sets up a schema and other users or applications that perform day-to-day operations on
the tables.
</p>

<codeblock>
CREATE ROLE one_database;
GRANT ROLE one_database TO GROUP admin_group;
GRANT ALL ON DATABASE db1 TO ROLE one_database;

CREATE ROLE instructor;
GRANT ROLE instructor TO GROUP trainers;
GRANT ALL ON TABLE db1.lesson TO ROLE instructor;

-- This particular course is all about queries, so the students can SELECT but not INSERT or CREATE/DROP.
CREATE ROLE student;
GRANT ROLE student TO GROUP visitors;
GRANT SELECT ON TABLE db1.training TO ROLE student;</codeblock>

</example>

<example>

<title>Privileges for Working with External Data Files</title>

<p>
When data is being inserted through the <codeph>LOAD DATA</codeph> statement, or is
referenced from an HDFS location outside the normal Impala database directories, the
user also needs appropriate permissions on the URIs corresponding to those HDFS
locations.
</p>

<p>
In this example:
</p>

<ul>
<li>
The <codeph>external_table</codeph> role can insert into and query the Impala table,
<codeph>external_table.sample</codeph>.
</li>

<li>
The <codeph>staging_dir</codeph> role can specify the HDFS path
<filepath>/user/impala-user/external_data</filepath> with the <codeph>LOAD
DATA</codeph> statement. When Impala queries or loads data files, it operates on all
the files in that directory, not just a single file, so any Impala
<codeph>LOCATION</codeph> parameters refer to a directory rather than an individual
file.
</li>
</ul>

<codeblock>CREATE ROLE external_table;
GRANT ROLE external_table TO GROUP impala_users;
GRANT ALL ON TABLE external_table.sample TO ROLE external_table;

CREATE ROLE staging_dir;
GRANT ROLE staging_dir TO GROUP impala_users;
GRANT ALL ON URI 'hdfs://127.0.0.1:8020/user/impala-user/external_data' TO ROLE staging_dir;</codeblock>

</example>

<example>

<title>Separating Administrator Responsibility from Read and Write Privileges</title>

<p>
To create a database, you need the full privilege on that database, while day-to-day
operations on tables within that database can be performed with lower levels of
privilege on specific tables. Thus, you might set up separate roles for each database
or application: an administrative one that could create or drop the database, and a
user-level one that can access only the relevant tables.
</p>

<p>
In this example, the responsibilities are divided between users in three different groups:
</p>

<ul>
<li>
Members of the <codeph>supergroup</codeph> group have the
<codeph>training_sysadmin</codeph> role and so can set up a database named
<codeph>training1</codeph>.
</li>

<li>
Members of the <codeph>impala_users</codeph> group have the
<codeph>instructor</codeph> role and so have all privileges on the
<codeph>course1</codeph> table in the <codeph>training1</codeph> database, but cannot
create or drop the database itself.
</li>

<li>
Members of the <codeph>visitor</codeph> group have the <codeph>student</codeph> role
and so can query that table in the <codeph>training1</codeph> database.
</li>
</ul>

<codeblock>CREATE ROLE training_sysadmin;
GRANT ROLE training_sysadmin TO GROUP supergroup;
GRANT ALL ON DATABASE training1 TO ROLE training_sysadmin;

CREATE ROLE instructor;
GRANT ROLE instructor TO GROUP impala_users;
GRANT ALL ON TABLE training1.course1 TO ROLE instructor;

CREATE ROLE student;
GRANT ROLE student TO GROUP visitor;
GRANT SELECT ON TABLE training1.course1 TO ROLE student;</codeblock>

</example>

</conbody>

</concept>

<concept id="security_policy_file">

<title>Using Impala with the Sentry Policy File</title>

<conbody>

<p>
The policy file is a file that you put in a designated location in HDFS. It is read
during the startup of the <cmdname>impalad</cmdname> daemon when you specify both the
<codeph>-server_name</codeph> and <codeph>-authorization_policy_file</codeph> startup
options. It controls which objects (databases, tables, and HDFS directory paths) can be
accessed by the user who connects to <cmdname>impalad</cmdname>, and what operations
that user can perform on the objects.
</p>

<note rev="1.4.0">
The policy-file based authorization was deprecated in <keyword keyref="impala26"/>. We
recommend managing privileges through SQL statements as described in
<xref href="impala_authorization.xml#sentry_service"/>. If you are still using
policy files, plan to migrate to the new approach some time in the future.
</note>

<p>
The location of the policy file is listed in the <filepath>auth-site.xml</filepath>
configuration file.
</p>

<p>
When authorization is enabled, Impala uses the policy file as a <i>whitelist</i>,
representing every privilege available to any user on any object. That is, only
operations specified for the appropriate combination of object, role, group, and user
are allowed. All other operations are not allowed. If a group or role is defined
multiple times in the policy file, the last definition takes precedence.
</p>

<p>
To understand the notion of whitelisting, set up a minimal policy file that does not
provide any privileges for any object. When you connect to an Impala node where this
policy file is in effect, you get no results for <codeph>SHOW DATABASES</codeph>, and an
error when you issue any <codeph>SHOW TABLES</codeph>, <codeph>USE
<varname>database_name</varname></codeph>, <codeph>DESCRIBE
<varname>table_name</varname></codeph>, <codeph>SELECT</codeph>, or other statement
that expects to access databases or tables, even if the corresponding databases and
tables exist.
</p>

<p>
The contents of the policy file are cached, to avoid a performance penalty for each
query. The policy file is re-checked by each <cmdname>impalad</cmdname> node every 5
minutes. When you make a non-time-sensitive change such as adding new privileges or new
users, you can let the change take effect automatically a few minutes later. If you
remove or reduce privileges, and want the change to take effect immediately, restart the
<cmdname>impalad</cmdname> daemon on all nodes, again specifying the
<codeph>-server_name</codeph> and <codeph>-authorization_policy_file</codeph> options so
that the rules from the updated policy file are applied.
</p>

</conbody>

<concept id="security_policy_file_details">

<title>Policy File Format</title>

<conbody>

<p>
The policy file uses the familiar <codeph>.ini</codeph> format, divided into the major
sections <codeph>[groups]</codeph> and <codeph>[roles]</codeph>.
</p>

<p>
There is also an optional <codeph>[databases]</codeph> section, which allows you to
specify a specific policy file for a particular database, as explained in
<xref href="#security_multiple_policy_files"/>.
</p>

<p>
Another optional section, <codeph>[users]</codeph>, allows you to override the
OS-level mapping of users to groups; that is an advanced technique primarily for
testing and debugging, and is beyond the scope of this document.
</p>

<p>
In the <codeph>[groups]</codeph> section, you define various categories of users and
select which roles are associated with each category. The group and usernames
correspond to Linux groups and users on the server where the
<cmdname>impalad</cmdname> daemon runs.
</p>

<p>
The group and usernames in the <codeph>[groups]</codeph> section correspond to Hadoop
groups and users on the server where the <cmdname>impalad</cmdname> daemon runs. When
you access Impala through the <cmdname>impala-shell</cmdname> interpreter, for purposes
of authorization, the user is the logged-in Linux user and the groups are the Linux
groups that user is a member of. When you access Impala through the ODBC or JDBC
interfaces, the user and password specified through the connection string are used as
login credentials for the Linux server, and authorization is based on that username
and the associated Linux group membership.
</p>

<p>
In the <codeph>[roles]</codeph> section, you define a set of roles. For each role, you
specify precisely the set of privileges that is available: which objects users
with that role can access, and what operations they can perform on those objects. This
is the lowest-level category of security information; the other sections in the policy
file map the privileges to higher-level divisions of groups and users. In the
<codeph>[groups]</codeph> section, you specify which roles are associated with which
groups. The privileges are specified using patterns like:
<codeblock>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=SELECT
server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=CREATE
server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=ALL
</codeblock>
For the <varname>server_name</varname> value, substitute the same symbolic name you
specify with the <cmdname>impalad</cmdname> <codeph>-server_name</codeph> option. You
can use <codeph>*</codeph> wildcard characters at each level of the privilege
specification to allow access to all such objects. For example:
<codeblock>server=impala-host.example.com->db=default->table=t1->action=SELECT
server=impala-host.example.com->db=*->table=*->action=CREATE
server=impala-host.example.com->db=*->table=audit_log->action=SELECT
server=impala-host.example.com->db=default->table=t1->action=*
</codeblock>
</p>

</conbody>

</concept>

<concept id="security_multiple_policy_files">

<title>Using Multiple Policy Files for Different Databases</title>

<conbody>

<p>
For an Impala cluster with many databases being accessed by many users and
applications, it might be cumbersome to update the security policy file for each
privilege change or each new database, table, or view. You can allow security to be
managed separately for individual databases, by setting up a separate policy file for
each database:
</p>

<ul>
<li>
Add the optional <codeph>[databases]</codeph> section to the main policy file.
</li>

<li>
Add entries in the <codeph>[databases]</codeph> section for each database that has
its own policy file.
</li>

<li>
For each listed database, specify the HDFS path of the appropriate policy file.
</li>
</ul>

<p>
For example:
</p>

<codeblock>[databases]
# Defines the location of the per-DB policy files for the 'customers' and 'sales' databases.
customers = hdfs://ha-nn-uri/etc/access/customers.ini
sales = hdfs://ha-nn-uri/etc/access/sales.ini
</codeblock>

<p>
To enable URIs in per-DB policy files, the Java configuration option
<codeph>sentry.allow.uri.db.policyfile</codeph> must be set to <codeph>true</codeph>.
For example:
</p>

<codeblock>JAVA_TOOL_OPTIONS="-Dsentry.allow.uri.db.policyfile=true"
</codeblock>

<note type="important">
Enabling URIs in per-DB policy files introduces a security risk by allowing the owner
of the db-level policy file to grant themselves load privileges to anything the
<codeph>impala</codeph> user has read permissions for in HDFS (including data in other
databases controlled by different db-level policy files).
</note>

</conbody>

</concept>

</concept>

<concept id="security_schema">

<title>Setting Up Schema Objects for a Secure Impala Deployment</title>

<conbody>

<p>
In your role definitions, you must specify privileges at the level of individual
databases and tables, or all databases or all tables within a database. To simplify the
structure of these rules, plan ahead of time how to name your schema objects so that
data with different authorization requirements is divided into separate databases.
</p>

<p>
If you are adding security on top of an existing Impala deployment, you can rename
tables or even move them between databases using the <codeph>ALTER TABLE</codeph>
statement.
</p>
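
<p>
For instance, the following sketch moves a table out of one database and into another
that is reserved for restricted data. The database and table names are hypothetical, and
the target database is assumed to exist already:
</p>

<codeblock>-- Hypothetical names. A database-qualified RENAME TO moves the table
-- from the default database into the hr_restricted database.
ALTER TABLE default.payroll RENAME TO hr_restricted.payroll;
</codeblock>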

</conbody>

</concept>

<concept id="sentry_debug">

<title><ph conref="../shared/impala_common.xml#common/title_sentry_debug"/></title>

<conbody>

<p conref="../shared/impala_common.xml#common/sentry_debug"/>

</conbody>

</concept>

<concept id="sec_ex_default">

<title>The DEFAULT Database in a Secure Deployment</title>

<conbody>

<p>
Because of the extra emphasis on granular access controls in a secure deployment, you
should move any important or sensitive information out of the <codeph>DEFAULT</codeph>
database into a named database whose privileges are specified in the policy file.
Sometimes you might need to give privileges on the <codeph>DEFAULT</codeph> database for
administrative reasons; for example, as a place you can reliably specify with a
<codeph>USE</codeph> statement when preparing to drop a database.
</p>
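
<p>
The drop-database case can be sketched as follows. The database name
<codeph>scratch_db</codeph> is hypothetical, and the user is assumed to hold sufficient
privileges on both databases:
</p>

<codeblock>-- Hypothetical database name. Switch to the DEFAULT database first,
-- so that the database being dropped is not the current one.
USE default;
DROP DATABASE scratch_db;
</codeblock>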

</conbody>

</concept>

</concept>