Geo_shape questions

I posted a message yesterday, but somehow it didn't make it to the list. Trying
again...

I have some questions/remarks about the (incredibly useful) geo_shape type
and filters.

  • Are there plans to support "Pre-Indexed-Shapes" in documents as well, i.e.
    to reference a pre-indexed shape when indexing a new document, instead of
    adding the geometry itself to that document? I would expect that in many
    use cases the same geometry will be indexed with multiple docs. Just as
    with filters/queries, performance could benefit quite a lot if the
    indexer could just copy the hashes over from an already indexed geometry.

  • Imho, allowing a geometry to be serialised as e.g. WKT would cut down not
    only on the size of documents, but also on the work that elasticsearch
    needs to do for serializing/deserializing geometries. Polygons quickly
    become really big when expressed in JSON... Is this something that is
    being considered, and/or would it be accepted if provided in a decent
    pull request?

  • A bit more documentation about how the combination of distance_error_pct
    and tree_levels affects the precision/results of filters would really be
    appreciated. From the docs and code I'm having a hard time understanding
    the consequences of altering both values on filters and indexes. What
    exactly does distance_error_pct do, and how does it affect e.g. an
    intersection filter?

  • Quote from the docs: "Because of current limitations of the algorithm,
    very large indexed shapes are not deemed to intersect with very small
    filter shapes". Are there any plans to fix this? Assuming the algorithmic
    problem is that large shapes are only hashed up to a maximum depth, there
    are a couple of ways to fix this. E.g. the indexer could add an extra field
    with the hashes from only the deepest hash-level it uses for that geometry.
    The intersection filter could use this by OR-ing the current filter
    with a term filter on that field for all parents of the
    hashes it currently uses for searching. That way larger shapes that
    intersect will be included, and smaller shapes that only happen to share a
    parent won't be included.

  • The algorithm for "within" (in TermQueryPrefixStrategy) could be improved
    (imho). It's currently inconsistent for geometries that are equal to or
    just a tiny bit smaller than the filter geometry, and I think that could
    easily be fixed. I filed an issue about this yesterday, so I won't go
    further into it here, see
    https://github.com/elasticsearch/elasticsearch/issues/2552
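
To make the "extra field with the deepest hashes" idea from the intersection
point a bit more concrete, here is a rough Python sketch. It is not
elasticsearch code: it assumes a toy quadtree over the unit square instead of
real geohashes, shapes are plain rectangles, and all names (cells_at_depth,
index_shape, the "cells"/"deepest" fields, ...) are made up for illustration.
I also assume the current filter only matches the query's deepest cells
against the indexed cells, which is how I read the code.

```python
def cells_at_depth(rect, depth):
    """Ids of all toy-quadtree cells at exactly `depth` that overlap rect.

    rect is (x0, y0, x1, y1) inside the unit square; a cell id is a string
    of digits 0-3, one digit per tree level.
    """
    x0, y0, x1, y1 = rect
    found = set()

    def visit(cell, cx0, cy0, size):
        # Skip cells that do not overlap the rectangle at all.
        if cx0 >= x1 or cx0 + size <= x0 or cy0 >= y1 or cy0 + size <= y0:
            return
        if len(cell) == depth:
            found.add(cell)
            return
        half = size / 2
        visit(cell + "0", cx0, cy0, half)
        visit(cell + "1", cx0 + half, cy0, half)
        visit(cell + "2", cx0, cy0 + half, half)
        visit(cell + "3", cx0 + half, cy0 + half, half)

    visit("", 0.0, 0.0, 1.0)
    return found


def index_shape(rect, depth):
    """Index a shape: all cell levels, plus the proposed deepest-only field."""
    leaves = cells_at_depth(rect, depth)
    all_levels = {c[:i] for c in leaves for i in range(1, depth + 1)}
    return {"cells": all_levels, "deepest": leaves}


def parents(cells):
    """All proper ancestors of the given cells."""
    return {c[:i] for c in cells for i in range(1, len(c))}


def current_filter(doc, query_cells):
    # Assumed current behaviour: match query cells against indexed cells.
    return bool(doc["cells"] & query_cells)


def naive_filter(doc, query_cells):
    # Naive fix: also match the query's parents against *all* indexed cells.
    return current_filter(doc, query_cells) or bool(doc["cells"] & parents(query_cells))


def proposed_filter(doc, query_cells):
    # Proposed fix: match the query's parents against the deepest cells only.
    return current_filter(doc, query_cells) or bool(doc["deepest"] & parents(query_cells))


# A large shape hashed to a shallow depth, and a small, unrelated shape that
# merely shares a top-level cell with the query.
large = index_shape((0.0, 0.0, 0.5, 0.5), depth=2)
neighbour = index_shape((0.30, 0.30, 0.32, 0.32), depth=5)
query = cells_at_depth((0.10, 0.10, 0.12, 0.12), depth=5)

print(current_filter(large, query))       # False: the large shape is missed
print(proposed_filter(large, query))      # True: found via parent "00"
print(naive_filter(neighbour, query))     # True: false positive via parent "0"
print(proposed_filter(neighbour, query))  # False: neighbour stays excluded
```

The point of the separate "deepest" field is the last two lines: matching
parents against all indexed levels drags in every small shape under the same
parent cell, whereas matching against the deepest level only picks up shapes
that were actually hashed at that coarser level.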

Thanks!

--