Filtered Query With GeoDistanceFilter causing OutOfMemoryError, even when query returns 0 documents


(Scott Howlett) #1

Hi all -

I have a document set of ~100,000 items. Each document has a
geometry.location (a single mapped GeoPoint) and a geometry.coordinates (an
array of mapped GeoPoints). On average there may be 100 GeoPoints in the
geometry.coordinates array (so 10,000,000 GeoPoints in total). I
understand that executing a match_all query with a GeoDistanceFilter
against geometry.coordinates may require loading all the 10,000,000
geopoints into memory.
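
The total above can be sketched as simple arithmetic. The 16 bytes per point is an assumption (two 8-byte doubles for lat and lon), not a measured figure; whatever structure Elasticsearch uses internally may add overhead on top of the raw values.

```python
# Back-of-envelope estimate of the raw size of the geopoint values.
# Assumption: each point is held as two 8-byte doubles (lat, lon);
# the real in-memory layout in Elasticsearch may differ and add overhead.
DOCS = 100_000
POINTS_PER_DOC = 100
BYTES_PER_POINT = 2 * 8  # hypothetical: one double each for lat and lon

total_points = DOCS * POINTS_PER_DOC
raw_bytes = total_points * BYTES_PER_POINT
raw_mib = raw_bytes / (1024 ** 2)

print(f"{total_points:,} points, roughly {raw_mib:.0f} MiB of raw coordinates")
```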

To avoid this, I'm attempting to execute a filtered query, which I
understand should execute the query first and the filter second. Thus, if
the query returns, say, 10 documents, the GeoDistance filter would only
have to run across 1,000 GeoPoints (10 documents x 100 GeoPoints per
document). However, every time I do this I get an OOM exception, even when
the query returns 0 documents.
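
The expectation described above can be sketched as a small model. This is only an illustration of "query first, filter second" on hypothetical in-memory data (the users `alice` and `bob`, the function names, and the sample coordinates are all made up); it does not claim to show how Elasticsearch actually evaluates a filtered query.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def query_then_filter(docs, user, center, max_km):
    """Match on user first, then distance-check only the matched docs.

    Returns (hits, points_considered), where points_considered is the
    number of geopoints belonging to the documents the query matched.
    """
    matched = [d for d in docs if d["user"] == user]
    points_considered = sum(len(d["coordinates"]) for d in matched)
    hits = [
        d for d in matched
        if any(haversine_km(lat, lon, center[0], center[1]) <= max_km
               for lat, lon in d["coordinates"])
    ]
    return hits, points_considered

# Hypothetical sample data: 10 matching docs and 5 non-matching docs,
# each with 100 geopoints.
docs = [{"user": "alice", "coordinates": [(43.0 + i * 0.001, -79.0)] * 100}
        for i in range(10)]
docs += [{"user": "bob", "coordinates": [(10.0, 10.0)] * 100} for _ in range(5)]

hits, considered = query_then_filter(docs, "alice", (43.0, -79.0), 200.0)
no_hits, none_considered = query_then_filter(docs, "kimchy", (43.0, -79.0), 200.0)
```

In this model a query matching 10 documents touches 1,000 geopoints, and a query matching nothing touches none, which is the behaviour the filtered query was expected to have.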

Speculation: it seems that, regardless of the query, Elasticsearch attempts
to load all the GeoPoints in geometry.coordinates into memory.

Example I'm using in ES head (note that the bool query for user:kimchy
returns 0 documents):

{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": {
            "term": {
              "user": "kimchy"
            }
          }
        }
      },
      "filter": {
        "geo_distance": {
          "distance": "200km",
          "geometry.coordinates": {
            "lat": 43,
            "lon": -79
          }
        }
      }
    }
  }
}

Other tidbits:

  • ES 0.19.9
  • Java 7 Update 9
  • Windows 7, 64 bit
  • 8GB memory allocated to ES
  • 16GB memory total

Any help or suggestions are much appreciated!
