The following are the major steps to harden a cluster running Impala against accidents and mistakes, or malicious attackers trying to access sensitive data:
Secure the
Restrict membership in the
Ensure the Hadoop ownership and permissions for Impala data files are restricted.
Ensure the Hadoop ownership and permissions for Impala log files are restricted.
Ensure that the Impala web UI (available by default on port 25000 on each Impala node) is
password-protected. See
Create a policy file that specifies which Impala privileges are available to users in particular Hadoop
groups (which by default map to Linux OS groups). Create the associated Linux groups using the
The Impala authorization feature makes use of the HDFS file ownership and permissions mechanism; for
background information, see the
Design the structure of your databases, tables, and views so that the policy file can express
simple, consistent rules. For example, if all tables related to an application are inside a single
database, you can assign privileges for that database and use the
Enable authorization by running the
Set up authentication using Kerberos, to make sure users really are who they say they are.
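
The ownership and permission checks in the steps above can be partly automated. The following is a minimal sketch, not an Impala tool: it walks a directory on the local filesystem and reports entries that grant any access to "other" users. The function name find_loose_permissions and the path /var/log/impala are assumptions made for illustration; substitute the log directory your deployment actually uses. Files stored in HDFS are not covered by this sketch and would need to be checked with the HDFS tools instead.

```python
import os
import stat


def find_loose_permissions(root, forbidden_bits=stat.S_IRWXO):
    """Return sorted paths under `root` whose mode includes any forbidden bit.

    By default the forbidden bits are read/write/execute for "other",
    so any world-accessible file or directory is reported.
    """
    loose = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # permission bits on a symlink itself are not meaningful
            if stat.S_IMODE(mode) & forbidden_bits:
                loose.append(path)
    return sorted(loose)


if __name__ == "__main__":
    # Hypothetical log location (an assumption, not taken from the text above).
    for path in find_loose_permissions("/var/log/impala"):
        print("world-accessible:", path)
```

To flag group access as well, pass forbidden_bits=stat.S_IRWXG | stat.S_IRWXO. The sketch only reports problems; tightening the permissions is left to the administrator.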