IMPALA-14573: port critical geospatial functions to C++ (part 1)

This commit contains the simpler parts from
https://gerrit.cloudera.org/#/c/20602

This mainly means accessors for the header of the binary
format and the bounding box check (st_envIntersects).
Tests for functions / overloads that were not yet covered are also added.
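
As a rough illustration of what a bounding box check involves, here is a
minimal Python sketch. It is not the native implementation from this
commit: the `Envelope` type and the `env_intersects` name are hypothetical,
and treating touching edges as intersecting is an assumption of the sketch.

```python
from collections import namedtuple

# Hypothetical envelope type for illustration only. The real header layout
# is described in be/src/exprs/geo/shape-format.h and is not reproduced here.
Envelope = namedtuple("Envelope", ["xmin", "ymin", "xmax", "ymax"])


def env_intersects(a, b):
  """Return True when the two axis-aligned boxes overlap.

  Assumption: boxes that only touch on an edge or corner count as
  intersecting.
  """
  return (a.xmin <= b.xmax and b.xmin <= a.xmax
          and a.ymin <= b.ymax and b.ymin <= a.ymax)


box = Envelope(0.0, 0.0, 2.0, 2.0)
overlapping = env_intersects(box, Envelope(1.0, 1.0, 3.0, 3.0))
disjoint = env_intersects(box, Envelope(2.5, 0.0, 3.0, 1.0))
```

The check only needs the four envelope values of each geometry, which is
why it can work from header data without touching the full geometry.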

For details of the binary format see be/src/exprs/geo/shape-format.h

Differences from the PR above:

Only a subset of the functions is added. The criteria were:
1. the native function must be fully compatible with the Java version*
2. must not rely on (de)serializing the full geometry
3. the function must be tested

Criterion 1 implies 2 because the original patch does not yet
implement (de)serialization for >2D geometries, which would break
compatibility with the Java version for XYZ/XYM/XYZM geometries.

*: there are 2 known differences:
 1. NULL handling: the Java functions return an error instead of NULL
    when getting a NULL parameter
 2. st_envIntersects() doesn't check if the SRID matches - the Java
    library looks inconsistent about this
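
The NULL handling difference can be pictured with a small Python sketch.
This is only a model of the two behaviours described above, not code from
either implementation: `null_propagating`, `null_rejecting` and the tuple
used as a point are hypothetical names chosen for the example.

```python
def null_propagating(fn):
  """Model of the native behaviour: a NULL (None) argument yields NULL."""
  def wrapper(*args):
    if any(arg is None for arg in args):
      return None
    return fn(*args)
  return wrapper


def null_rejecting(fn):
  """Model of the Java behaviour: a NULL (None) argument raises an error."""
  def wrapper(*args):
    if any(arg is None for arg in args):
      raise ValueError("NULL argument")
    return fn(*args)
  return wrapper


# Hypothetical accessor: a point is modelled as an (x, y) tuple.
get_x_native = null_propagating(lambda point: point[0])
get_x_java = null_rejecting(lambda point: point[0])

native_result = get_x_native(None)  # None, no error
try:
  get_x_java(None)
  java_raised = False
except ValueError:
  java_raised = True
```

For non-NULL inputs both wrappers return the same value, which matches the
claim that the functions are otherwise compatible.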

Because the native functions are fairly safe replacements for the Java
ones, they are always used when geospatial_library=HIVE_ESRI.

Change-Id: I0ff950a25320549290a83a3b1c31ce828dd68e3c
Reviewed-on: http://gerrit.cloudera.org:8080/23700
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Csaba Ringhofer
2025-11-20 20:57:25 +01:00
committed by Impala Public Jenkins
parent fe41448780
commit 780e6683a2
15 changed files with 1029 additions and 9 deletions


@@ -22,6 +22,7 @@ from tests.common.custom_cluster_test_suite import CustomClusterTestSuite
from tests.common.skip import SkipIfApacheHive
ST_POINT_SIGNATURE = "BINARY\tst_point(STRING)\tJAVA\ttrue"
ST_X_SIGNATURE_BUILTIN = "DOUBLE\tst_x(BINARY)\tBUILTIN\ttrue"
SHOW_FUNCTIONS = "show functions in _impala_builtins"
@@ -34,9 +35,11 @@ class TestGeospatialLibrary(CustomClusterTestSuite):
  def test_disabled(self):
    result = self.execute_query(SHOW_FUNCTIONS)
    assert ST_POINT_SIGNATURE not in result.data
    assert ST_X_SIGNATURE_BUILTIN not in result.data

  @SkipIfApacheHive.feature_not_supported
  @pytest.mark.execute_serially
  def test_enabled(self):
    result = self.execute_query(SHOW_FUNCTIONS)
    assert ST_POINT_SIGNATURE in result.data
    assert ST_X_SIGNATURE_BUILTIN in result.data


@@ -18,14 +18,29 @@
from __future__ import absolute_import, division, print_function
from tests.common.impala_test_suite import ImpalaTestSuite
from tests.common.skip import SkipIfApacheHive
from tests.common.test_dimensions import create_single_exec_option_dimension
class TestGeospatialFuctions(ImpalaTestSuite):
  @classmethod
  def add_test_dimensions(cls):
    super(TestGeospatialFuctions, cls).add_test_dimensions()
    cls.ImpalaTestMatrix.add_dimension(create_single_exec_option_dimension())
    # Tests do not use tables at the moment, skip other fileformats than Parquet.
    cls.ImpalaTestMatrix.add_constraint(lambda v:
        v.get_value('table_format').file_format == 'parquet')
  """Tests the geospatial builtin functions"""
  @SkipIfApacheHive.feature_not_supported
  def test_esri_geospatial_functions(self, vector):
    # tests generated from
    # https://github.com/Esri/spatial-framework-for-hadoop/tree/master/hive/test
    self.run_test_case('QueryTest/geospatial-esri', vector)
    # manual tests added
    self.run_test_case('QueryTest/geospatial-esri-extra', vector)
  @SkipIfApacheHive.feature_not_supported
  def test_esri_geospatial_planner(self, vector):
    # These tests are not among planner tests because with default flags
    # geospatial builtin functions are not loaded.