IMPALA-13588: Update Puffin reading doc after IMPALA-13370

IMPALA-13370 added support for reading Puffin NDV stats from the
metadata.json if the "NDV" property is available. This change updates
the docs accordingly.
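
For illustration only, the lookup described above can be sketched in Python.
This is not Impala's implementation. It assumes the metadata.json layout from
the Iceberg table spec (a top-level "statistics" list whose entries carry a
"blob-metadata" list with "fields" and optional "properties"). The function
name, the sample path and the sample values are hypothetical.

```python
import json

# Hypothetical, trimmed-down metadata.json content. Only the keys the sketch
# needs are present; the path and all values are made up for illustration.
SAMPLE_METADATA = json.loads("""
{
  "statistics": [
    {
      "snapshot-id": 1,
      "statistics-path": "s3://example-bucket/tbl/metadata/stats.puffin",
      "blob-metadata": [
        {"type": "apache-datasketches-theta-v1", "snapshot-id": 1,
         "sequence-number": 1, "fields": [1], "properties": {"ndv": "42"}},
        {"type": "apache-datasketches-theta-v1", "snapshot-id": 1,
         "sequence-number": 1, "fields": [2]}
      ]
    }
  ]
}
""")


def ndv_from_metadata(metadata):
    """Split blobs into those whose NDV is available in metadata.json
    and those that would still require reading the Puffin file.

    Returns (ndv_by_field_id, field_ids_needing_puffin).
    """
    ndv_by_field = {}
    needs_puffin = []
    for stats_file in metadata.get("statistics", []):
        for blob in stats_file.get("blob-metadata", []):
            ndv = blob.get("properties", {}).get("ndv")
            for field_id in blob.get("fields", []):
                if ndv is not None:
                    # Property values are stored as strings in the JSON.
                    ndv_by_field[field_id] = int(ndv)
                else:
                    # No "ndv" property: fall back to the Puffin blob.
                    needs_puffin.append(field_id)
    return ndv_by_field, needs_puffin


if __name__ == "__main__":
    found, missing = ndv_from_metadata(SAMPLE_METADATA)
    print("NDV from metadata.json:", found)
    print("Fields needing a Puffin read:", missing)
```

In this sketch, only the second blob would trigger a Puffin file read, which
is the file I/O saving the doc change describes.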

Change-Id: I95f5454d736ffb3a2c043f9b490c62976ccd0c2a
Reviewed-on: http://gerrit.cloudera.org:8080/22140
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Noemi Pap-Takacs <npaptakacs@cloudera.com>
Reviewed-by: Peter Rozsa <prozsa@cloudera.com>
Daniel Becker
2024-11-28 12:20:15 +01:00
committed by Peter Rozsa
parent 907c1738a0
commit b49f45eacb

@@ -879,6 +879,12 @@ ORDER BY made_current_at;
values in the HMS may be stale.
</p>
<p>
Some engines, e.g. Trino, also write the NDV as a property (with key "ndv") in the
"statistics" section of the metadata.json file for each blob, in addition to the
Puffin file. If such a property is present for a blob, Impala will read the value
from the metadata.json file instead of the Puffin file to reduce file I/O.
</p>
<p>
Note that it is currently not possible to drop Puffin stats from Impala.
For this reason, it is possible to disable reading Puffin stats in two ways:
<ul>