The Impala logs record information about:
Formerly, the logs contained the query profile for each query, showing low-level details of how the work is
distributed among nodes and how intermediate and final results are transmitted across the network. To save
space, those query profiles are now stored in zlib-compressed files in
The auditing feature introduced in Impala 1.1.1 produces a separate set of audit log files when
enabled. See
In
The lineage feature introduced in Impala 2.2.0 produces a separate lineage log file when
enabled. See
Impala stores information using the
Review the Impala log files on each host once you have traced an issue back to a specific system.
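When reviewing a log file on a host, it can help to filter entries by severity. The following is a minimal sketch, assuming the log lines follow the classic GLOG prefix layout (severity letter, month and day, time, thread ID, source file and line). The sample lines, file names, and the helper name `filter_by_severity` are hypothetical and are not taken from a real Impala log.

```python
import re

# Classic glog line prefix: severity letter, month/day, time, thread id, source file:line.
# Example: "I0523 12:34:56.789012 12345 example.cc:42] message text"
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) "
    r"(?P<source>[^\]]+)\] "
    r"(?P<message>.*)$"
)


def filter_by_severity(lines, severities=("W", "E", "F")):
    """Yield parsed entries whose severity letter is in `severities`.

    Lines that do not start with a glog prefix (for example, continuation
    lines of a multi-line message) are skipped.
    """
    for line in lines:
        match = GLOG_LINE.match(line.rstrip("\n"))
        if match and match.group("severity") in severities:
            yield match.groupdict()


# Hypothetical sample lines, used only to exercise the parser.
sample = [
    "I0523 12:34:56.789012 12345 example.cc:42] Started successfully",
    "W0523 12:35:01.000123 12345 example.cc:77] Slow response from peer",
    "    continuation line without a glog prefix",
    "E0523 12:35:02.500000 12346 other.cc:10] Could not open file",
]

for entry in filter_by_severity(sample):
    print(entry["severity"], entry["time"], entry["message"])
```

To scan a real file, pass an open file object instead of the sample list, because the function accepts any iterable of lines.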
Impala periodically switches the physical files representing the current log files, after which it is safe to remove the old files if they are no longer needed.
Impala can automatically remove older unneeded log files, a feature known as
In Impala 2.2 and higher, the
A value of 0 preserves all log files, in which case you would set up manual log rotation using your Linux tool or technique of choice. A value of 1 preserves only the very latest log file.
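If automatic removal is turned off, manual cleanup could look something like the sketch below, which keeps only the newest few files that share a name prefix. This is an illustration rather than a recommended tool: the function name `prune_old_logs`, the `demo.log.` prefix, and the file names are hypothetical. Symbolic links are skipped on the assumption that a link may point at the log file currently in use.

```python
import os
import tempfile
from pathlib import Path


def prune_old_logs(log_dir, prefix, keep):
    """Delete all but the `keep` most recently modified files starting with `prefix`.

    Returns the list of removed paths. `keep` must be at least 1 so that the
    newest file is never removed.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    candidates = [
        p for p in Path(log_dir).iterdir()
        if p.is_file() and not p.is_symlink() and p.name.startswith(prefix)
    ]
    # Newest first, so everything after the first `keep` entries is old.
    candidates.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    removed = []
    for old in candidates[keep:]:
        old.unlink()
        removed.append(old)
    return removed


# Demonstration in a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for age, name in enumerate(["demo.log.3", "demo.log.2", "demo.log.1"]):
        path = Path(tmp) / name
        path.write_text("sample\n")
        # Give each file a distinct modification time; demo.log.3 is the newest.
        stamp = 1_700_000_000 - age * 60
        os.utime(path, (stamp, stamp))
    removed = prune_old_logs(tmp, "demo.log.", keep=2)
    print(sorted(p.name for p in removed))
```

Sorting by modification time rather than by file name avoids depending on any particular naming scheme for rotated files.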
By default, the Impala log is stored at
On a machine named
The web interface limits the amount of logging information displayed. To view every log entry, access the log files directly through the file system.
You can view the contents of the
The logs store information about Impala startup options. This information is written once each time Impala starts and may include:
The logs contain information about each job Impala has run. Because each Impala job creates an additional set of data about queries, the amount of job-specific data can be very large. Logs may contain detailed information on jobs. These detailed log entries may include:
Impala uses the GLOG system, which supports three logging levels. You can adjust logging levels
by exporting variable settings. To change logging settings manually, use a command
similar to the following on each node before starting
For more information on how to configure GLOG, including how to set variable logging levels for different
system components, see
As logging levels increase, the categories of information logged are cumulative. For example, GLOG_v=2 records everything GLOG_v=1 records, as well as additional information.
Increasing the logging level imposes performance overhead and increases log size. Where practical, use GLOG_v=1: this level has minimal performance impact but still captures useful troubleshooting information.
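Because the logging level is read from the environment, a launcher script can set GLOG_v for a child process without changing the parent shell. The sketch below shows the idea under stated assumptions: the helper name `env_with_glog_verbosity` is hypothetical, and the child process is a stand-in that only echoes the value it inherited. A real deployment would start the Impala daemon in its place.

```python
import os
import subprocess
import sys


def env_with_glog_verbosity(level, base_env=None):
    """Return a copy of the environment with GLOG_v set to `level`."""
    if level < 0:
        raise ValueError("verbosity level must be non-negative")
    env = dict(os.environ if base_env is None else base_env)
    env["GLOG_v"] = str(level)
    return env


# Stand-in child process: it prints the GLOG_v value it inherited.
child = [sys.executable, "-c", "import os; print(os.environ['GLOG_v'])"]
result = subprocess.run(
    child,
    env=env_with_glog_verbosity(1),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Copying the environment rather than modifying `os.environ` keeps the setting scoped to the one process being launched.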
Additional information logged at each level is as follows:
In a security context, the log redaction feature is complementary to the Sentry authorization framework. Sentry prevents unauthorized users from being able to directly access table data. Redaction prevents administrators or support personnel from seeing the smaller amounts of sensitive or personally identifying information (PII) that might appear in queries issued by those authorized users.
See