IMPALA-10581: Implement ds_theta_intersect_f() function

This function takes two strings, each a serialized Apache DataSketches
Theta sketch, computes the intersection of the two sketches, and
returns the resulting sketch in serialized form. The input sketches
may come from the same column or from different columns.

Example:
select ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2))
from sketch_tbl;
+-----------------------------------------------------------+
| ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) |
+-----------------------------------------------------------+
| 5                                                         |
+-----------------------------------------------------------+
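
For intuition, the intersection of two Theta sketches can be modeled in
a few lines of Python. This is a simplified toy, not the Apache
DataSketches implementation and not the code Impala runs: the names
hash64, build_sketch, intersect and estimate, the use of SHA-256, and
the parameter k are all assumptions made for illustration.

```python
import hashlib

MAX_HASH = 2 ** 64  # hashes are treated as integers in [0, 2**64)


def hash64(value):
    """Map a value to a 64-bit integer (hypothetical hash choice)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_sketch(values, k=1024):
    """Return (theta, hashes) for a toy Theta sketch.

    theta is an integer threshold; the sketch retains only hashes below it.
    With at most k distinct hashes the sketch is exact (theta == MAX_HASH).
    """
    hashes = sorted({hash64(v) for v in values})
    if len(hashes) <= k:
        return MAX_HASH, frozenset(hashes)
    # Estimation mode: the (k+1)-th smallest hash becomes the threshold.
    return hashes[k], frozenset(hashes[:k])


def intersect(a, b):
    """Intersect two toy sketches.

    The result uses the smaller theta and keeps only the hashes that are
    present in both inputs and fall below that theta.
    """
    theta = min(a[0], b[0])
    kept = frozenset(h for h in a[1] & b[1] if h < theta)
    return theta, kept


def estimate(sketch):
    """Estimate the distinct count: retained hashes scaled by 1/theta."""
    theta, hashes = sketch
    return len(hashes) * MAX_HASH / theta


a = build_sketch(range(0, 10))   # values 0..9
b = build_sketch(range(5, 15))   # values 5..14
print(estimate(intersect(a, b)))
```

With ten values per side and five in common, both toy sketches stay in
exact mode, so the printed estimate is 5.0. Larger inputs switch to
estimation mode, where the result is only approximate.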

Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Reviewed-on: http://gerrit.cloudera.org:8080/17186
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Fucun Chu
2021-03-12 16:59:05 +08:00
committed by Impala Public Jenkins
parent 8f8668aaf0
commit 77d6acd032
6 changed files with 157 additions and 13 deletions


@@ -1009,6 +1009,8 @@ visible_functions = [
'_ZN6impala21DataSketchesFunctions14DsThetaExcludeEPN10impala_udf15FunctionContextERKNS1_9StringValES6_'],
[['ds_theta_union_f'], 'STRING', ['STRING', 'STRING'],
'_ZN6impala21DataSketchesFunctions13DsThetaUnionFEPN10impala_udf15FunctionContextERKNS1_9StringValES6_'],
[['ds_theta_intersect_f'], 'STRING', ['STRING', 'STRING'],
'_ZN6impala21DataSketchesFunctions17DsThetaIntersectFEPN10impala_udf15FunctionContextERKNS1_9StringValES6_'],
[['ds_kll_quantile'], 'FLOAT', ['STRING', 'DOUBLE'],
'_ZN6impala21DataSketchesFunctions13DsKllQuantileEPN10impala_udf15FunctionContextERKNS1_9StringValERKNS1_9DoubleValE'],
[['ds_kll_n'], 'BIGINT', ['STRING'],