Mirror of https://github.com/apache/impala.git (synced 2026-02-02 06:00:36 -05:00)
Before this patch, the NDV (number of distinct values) used for bloom filter sizing was based only on the cardinality of the build side. This is fine for FK/PK joins, but it can greatly overestimate the NDV when the build key column's NDV is smaller than the number of rows.

This change takes the minimum of the NDV (which is not reduced by selectivity) and the cardinality (which is reduced by selectivity).

Testing:
- Adjust test_bloom_filters and test_row_filters, raising the NDV of the test case so that the existing assertions still hold.
- Add an 8KB bloom filter test case in test_bloom_filters.

Change-Id: Idaa46789663cb2e6d29f518757d89c85ff8e4d1a
Reviewed-on: http://gerrit.cloudera.org:8080/19506
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
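
Illustration (not part of the patch): a minimal Python sketch of the min(NDV, cardinality) rule described above. The function names are hypothetical, and the sizing function uses the classic textbook bloom filter formula, which is an assumption made for illustration and not necessarily the formula Impala applies.

```python
import math


def estimate_filter_ndv(build_key_ndv, build_cardinality):
    """Hypothetical helper: NDV estimate used for filter sizing.

    build_key_ndv: distinct values in the build key column
        (not reduced by predicate selectivity).
    build_cardinality: estimated build-side row count
        (already reduced by predicate selectivity).
    """
    # A column cannot contribute more distinct values than there are rows,
    # and it cannot contribute more than its own NDV either.
    return min(build_key_ndv, build_cardinality)


def classic_bloom_filter_bytes(ndv, fpp=0.01):
    """Textbook bloom filter size in bytes for `ndv` keys at false-positive rate `fpp`.

    Assumption: standard formula m = -n * ln(p) / (ln 2)^2 bits.
    """
    bits = -ndv * math.log(fpp) / (math.log(2) ** 2)
    return math.ceil(bits / 8)


# Low-NDV build key: 1,000,000 rows but only 1,000 distinct key values.
old_estimate = 1_000_000                              # cardinality only
new_estimate = estimate_filter_ndv(1_000, 1_000_000)  # min(NDV, cardinality)
```

In the low-NDV case the cardinality-only estimate is three orders of magnitude larger than the new one, so a filter sized from it would be correspondingly oversized. For an FK/PK join, where the key NDV equals the row count before predicates, the minimum falls back to the selectivity-reduced cardinality, matching the previous behaviour.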