Geo_shape vs geo_point performance

I'm interested in using the pre-indexed shape feature. In my case, I
anticipate that the geo_shape values for the documents will in fact be a
point, but in order to benefit from the pre-indexed shape feature, I need
to map them to geo_shape as opposed to geo_point.

The test consists of 5 documents:

  • 2 documents with a geo_point field
  • 2 documents with a geo_shape field with a point
  • 1 document with both

In all cases, I'm searching with a polygon that has approximately 300
points. Note the big disparity in the times below.

The tests are as follows:

  • geo_shape filter with pre-indexed shape matches the geo_shape docs in
    ~ 8700 millis
  • geo_shape filter with the shape defined inline matches geo_shape
    docs in ~ 8600 millis
  • geo_polygon filter with the polygon defined inline matches the
    geo_point fields in ~ 2 millis.
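
For reference, the three filter bodies might look roughly like the
sketch below. It is only an illustration: the field names (shape_field,
point_field), the index/type/id of the pre-indexed shape, and the
generated 300-vertex ring are all placeholders, not the actual test data,
and the exact filter syntax can differ between Elasticsearch versions.

```python
import json
import math


def ring(center_lon, center_lat, radius_deg, n):
    """Build a closed ring of n distinct [lon, lat] vertices around a center."""
    pts = [
        [center_lon + radius_deg * math.cos(2 * math.pi * i / n),
         center_lat + radius_deg * math.sin(2 * math.pi * i / n)]
        for i in range(n)
    ]
    return pts + [pts[0]]  # GeoJSON-style rings repeat the first vertex


polygon_ring = ring(-71.0, 42.0, 1.0, 300)  # placeholder geometry

# 1. geo_shape filter referencing a pre-indexed shape (placeholder ids)
pre_indexed_filter = {
    "geo_shape": {
        "shape_field": {
            "indexed_shape": {"id": "big_polygon", "type": "shape", "index": "shapes"}
        }
    }
}

# 2. geo_shape filter with the polygon defined inline
inline_shape_filter = {
    "geo_shape": {
        "shape_field": {
            "shape": {"type": "polygon", "coordinates": [polygon_ring]}
        }
    }
}

# 3. geo_polygon filter against the geo_point field
geo_polygon_filter = {
    "geo_polygon": {
        "point_field": {
            "points": [{"lat": lat, "lon": lon} for lon, lat in polygon_ring[:-1]]
        }
    }
}

print(json.dumps(pre_indexed_filter, indent=2))
```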

I'm using _cached=true and running the queries multiple times, but I get
similar results each time.

Any thoughts on why there's such a big disparity between the two types?

--

Greetings,
The geo_shape and geo_point types use drastically different mechanisms for
querying. Quite simply, the geo_point method is more efficient, since it is
optimized for documents that only contain points; the geo_shape queries do
a lot of extra work because they assume they need to match 2-dimensional
shapes (which is not the case in your test).

I assume your query polygon is pretty large? You should get significantly
faster geo_shape queries by setting a smaller tree_levels value when you
create the mapping. The defaults are 24 for the default "geohash" tree and
12 for the "quadtree", so try cutting these in half. The optimal choice of
tree_levels is inversely related to the spatial extent of your query: the
larger the query area, the fewer levels you need.
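
To make that concrete, here is a rough sketch of a mapping with a reduced
tree_levels, plus a helper showing how coarse the cells become. The field
name "location" and the value 6 (half of the quadtree figure quoted above)
are just assumptions for illustration, and the cell-size arithmetic
assumes a quadtree that covers the whole world and halves each cell side
per level.

```python
import json


def quadtree_cell_degrees(level):
    """Approximate (width, height) in degrees of one cell at a quadtree level.

    Assumes the tree spans 360 x 180 degrees and each level halves both sides.
    """
    return 360.0 / 2 ** level, 180.0 / 2 ** level


# Hypothetical mapping body: "location" is a placeholder field name and
# 6 is simply half of the quadtree value mentioned in this thread.
mapping = {
    "properties": {
        "location": {
            "type": "geo_shape",
            "tree": "quadtree",
            "tree_levels": 6,
        }
    }
}

for level in (12, 6):
    width, height = quadtree_cell_degrees(level)
    print(f"level {level}: ~{width:.4f} x {height:.4f} degrees per cell")

print(json.dumps(mapping, indent=2))
```

Fewer levels mean larger cells, so a big query polygon is covered by far
fewer cells, at the cost of coarser precision near its boundary.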

Even after optimizing tree_levels, the geo_point queries will probably
always remain faster. The best bet might be to lobby the developers to
add an analogue of pre-indexed shape searches for geo_point fields.

In the meantime, if you do have success optimizing tree_levels, please
share your findings and query extents with the group here!

On Thursday, November 29, 2012 8:08:13 AM UTC-8, massfords wrote:
