Unable to index large geo_shapes in 2.0


(Brian Hudson) #1

We index some low resolution country polygons (they are still rather large). In ES 1.5.2 we were indexing these without issue with a precision of 1m, even on developer machines. They actually indexed rather quickly.

Having upgraded to 2.0, we aren't able to get these to index (ES 2.0 runs out of heap space, same amount allocated as in 1.5.2) even at 50m precision. With a precision of 100m they begin to index but it is still very slow (~20 minutes for Canada).

Were there significant changes to how geo_shapes are indexed in 2.0? Or perhaps some default setting that was changed? I reviewed the release notes for 2.0 but did not spot anything that I thought would affect this.


(Nick Knize) #2

Hi Brian_Hudson,

In 2.0, distance_error_pct defaults to 0 if precision or tree_levels is explicitly defined (see the distance_error_pct documentation for the geo_shape field type). For large shapes this can consume quite a bit of memory. To "revert" to the previous behavior, explicitly set distance_error_pct to 0.025 (or larger if you're comfortable with inaccurate results on the shape boundary, or if you are not querying with shapes at all).
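
For illustration, here is a minimal sketch of what such a mapping body might look like, built in Python and printed as JSON. The type name "country", the field name "geometry", and the 50m precision are placeholders for this example, not values taken from Brian's actual setup; only precision and distance_error_pct are the settings discussed above.

```python
import json


def build_geo_shape_mapping(type_name, field_name, precision, distance_error_pct=0.025):
    """Return an ES 2.0-style mapping body for a single geo_shape field.

    type_name and field_name are hypothetical placeholders. Setting
    distance_error_pct explicitly avoids the 2.0 behavior of falling
    back to 0 whenever precision is defined.
    """
    return {
        "mappings": {
            type_name: {
                "properties": {
                    field_name: {
                        "type": "geo_shape",
                        "precision": precision,
                        "distance_error_pct": distance_error_pct,
                    }
                }
            }
        }
    }


# Example values only; adjust to your own index.
mapping = build_geo_shape_mapping("country", "geometry", "50m")
print(json.dumps(mapping, indent=2))
```

The printed JSON would be sent as the body when creating the index. A larger distance_error_pct trades accuracy at the shape boundary for fewer indexed terms, so it is worth trying a couple of values against the largest polygons first.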

Note also that this is being fixed at the Lucene level so you don't have to worry about heap memory regardless of precision or shape size.

