Hi,
I was wondering if there are any plugins / methods for calculating the similarity of spatial features. I'm planning to build a recommendation system for spatial data. The data I have consists of points, polygons and linestrings.
I know there is the geo_shape query, and I think the intersects relation is a good start for finding related documents. However, there are cases where, for example, two features are very close but do not intersect. Additionally, I would like to have some sort of ranking.
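To make concrete what I mean by ranking, here is a minimal Python sketch, not an existing plugin. It assumes each feature can be reduced to one representative point (for example a centroid), which is a simplification for polygons and linestrings. The names `haversine_km` and `proximity_score` and the 10 km scale are my own placeholders.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def proximity_score(distance_km, scale_km=10.0):
    """Map a distance to a score in (0, 1]; closer features score higher.

    scale_km is a hypothetical tuning parameter: the score drops to
    about 0.37 (1/e) at that distance.
    """
    return math.exp(-distance_km / scale_km)
```

With a score like this, features that are close but do not intersect still get a non-zero similarity, and candidates can be sorted by it.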
The geohash grid aggregation is also interesting, since features whose geohashes share a prefix fall into the same bucket. But there are also edge cases where two features are related but not in the same bucket:
Nearby locations generally have similar prefixes, though not always: there are edge-cases straddling large-cell boundaries; in France, La Roche-Chalais (u000) is just 30km from Pomerol (ezzz). A reliable prefix search for proximate locations will also search prefixes of a cell’s 8 neighbours. (e.g. a database query for results within 30-odd kilometres of Pomerol would be SELECT * FROM MyTable WHERE LEFT(Geohash, 4) IN ('ezzz', 'gbpb', 'u000', 'spbp', 'spbn', 'ezzy', 'ezzw', 'ezzx', 'gbp8').) Whether this would offer significant (or any) performance gains over a latitude/longitude bounding box search I’ve yet to check.
From http://www.movable-type.co.uk/scripts/geohash.html
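The neighbour-prefix idea from the quote can be sketched in plain Python as follows. This is only an illustration under my own assumptions: `encode` and `cell_with_neighbours` are hypothetical helper names, and the neighbours are found by shifting the point by one cell width or height rather than by a lookup-table algorithm.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet


def encode(lat, lon, precision=4):
    """Encode a lat/lon pair (degrees) as a geohash of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, value, bits = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = value * 2 + int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            value, bits = 0, 0
    return "".join(chars)


def cell_with_neighbours(lat, lon, precision=4):
    """Return the geohash cell of a point plus its 8 neighbouring cells."""
    lon_bits = (5 * precision + 1) // 2
    lat_bits = (5 * precision) // 2
    lon_step = 360.0 / 2 ** lon_bits  # cell width in degrees
    lat_step = 180.0 / 2 ** lat_bits  # cell height in degrees
    cells = set()
    for dlat in (-1, 0, 1):
        for dlon in (-1, 0, 1):
            nlat = lat + dlat * lat_step
            if not -90.0 <= nlat <= 90.0:
                continue  # no cell beyond the poles
            nlon = (lon + dlon * lon_step + 180.0) % 360.0 - 180.0
            cells.add(encode(nlat, nlon, precision))
    return sorted(cells)
```

Using rough coordinates that I picked myself for the two towns, `encode(44.93, -0.20)` gives `ezzz` and `encode(45.15, 0.01)` gives `u000`, and `cell_with_neighbours(44.93, -0.20)` returns the same nine prefixes as the SQL example in the quote.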
Best,
lukas