mirror of
https://github.com/apache/impala.git
synced 2025-12-19 09:58:28 -05:00
Updated all the associated topics. Change-Id: Id4c5e9aa4d060ceaa426908a444d280a5564749d Reviewed-on: http://gerrit.cloudera.org:8080/17469 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
86 lines
3.8 KiB
XML
86 lines
3.8 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!--
|
|
Licensed to the Apache Software Foundation (ASF) under one
|
|
or more contributor license agreements. See the NOTICE file
|
|
distributed with this work for additional information
|
|
regarding copyright ownership. The ASF licenses this file
|
|
to you under the Apache License, Version 2.0 (the
|
|
"License"); you may not use this file except in compliance
|
|
with the License. You may obtain a copy of the License at
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
Unless required by applicable law or agreed to in writing,
|
|
software distributed under the License is distributed on an
|
|
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
|
KIND, either express or implied. See the License for the
|
|
specific language governing permissions and limitations
|
|
under the License.
|
|
-->
|
|
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
|
<concept id="secure_files">
|
|
|
|
<title>Securing Impala Data and Log Files</title>
|
|
<prolog>
|
|
<metadata>
|
|
<data name="Category" value="Impala"/>
|
|
<data name="Category" value="Security"/>
|
|
<data name="Category" value="Logs"/>
|
|
<data name="Category" value="HDFS"/>
|
|
<data name="Category" value="Administrators"/>
|
|
<!-- To do for John: mention redaction as a fallback to keep sensitive info out of the log files. -->
|
|
</metadata>
|
|
</prolog>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
One aspect of security is to protect files from unauthorized access at the filesystem level. For example, if
|
|
you store sensitive data in HDFS, you specify permissions on the associated files and directories in HDFS to
|
|
restrict read and write permissions to the appropriate users and groups.
|
|
</p>
|
|
|
|
<p>
|
|
If you issue queries containing sensitive values in the <codeph>WHERE</codeph> clause, such as financial
|
|
account numbers, those values are stored in Impala log files in the Linux filesystem and you must secure
|
|
those files also. For the locations of Impala log files, see <xref href="impala_logging.xml#logging"/>.
|
|
</p>
|
|
|
|
<p>
|
|
All Impala read and write operations are performed under the filesystem privileges of the
|
|
<codeph>impala</codeph> user. The <codeph>impala</codeph> user must be able to read all directories and data
|
|
files that you query, and write into all the directories and data files for <codeph>INSERT</codeph> and
|
|
<codeph>LOAD DATA</codeph> statements. At a minimum, make sure the <codeph>impala</codeph> user is in the
|
|
<codeph>hive</codeph> group so that it can access files and directories shared between Impala and Hive. See
|
|
<xref href="impala_prereqs.xml#prereqs_account"/> for more details.
|
|
</p>
|
|
|
|
<p>
|
|
Setting file permissions is necessary for Impala to function correctly, but is not an effective security
|
|
practice by itself:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
<p>
|
|
The way to ensure that only authorized users can submit requests for databases and tables they are allowed
|
|
to access is to set up Ranger authorization, as explained in
|
|
<xref href="impala_authorization.xml#authorization"/>. With authorization enabled, the checking of the user
|
|
ID and group is done by Impala, and unauthorized access is blocked by Impala itself. The actual low-level
|
|
read and write requests are still done by the <codeph>impala</codeph> user, so you must have appropriate
|
|
file and directory permissions for that user ID.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
You must also set up Kerberos authentication, as described in <xref href="impala_kerberos.xml#kerberos"/>,
|
|
so that users can only connect from trusted hosts. With Kerberos enabled, if someone connects a new host to
|
|
the network and creates user IDs that match your privileged IDs, they will be blocked from connecting to
|
|
Impala at all from that host.
|
|
</p>
|
|
</li>
|
|
</ul>
|
|
</conbody>
|
|
</concept>
|