Hi all -
I have a document set of ~100,000 items.  Each document has a
geometry.location (a single mapped GeoPoint) and a geometry.coordinates (an
array of mapped GeoPoints).  On average there may be 100 GeoPoints in the
geometry.coordinates array (so 10,000,000 GeoPoints in total).  I
understand that executing a match_all query with a GeoDistanceFilter
against geometry.coordinates may require loading all 10,000,000
GeoPoints into memory.
To avoid this, I'm attempting to execute a filtered query, which I
understand should execute the query first and the filter second.  Thus, if
the query returns, say, 10 documents, the GeoDistance filter should only
have to run across 1,000 GeoPoints (10 documents x 100 GeoPoints per
document).  However, every time I do this I get an OOM exception, even
when the query returns 0 documents.
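For reference, here is a rough sketch of the arithmetic behind those numbers. The 16 bytes per point is purely my assumption (lat and lon as two 8-byte doubles); I have not measured what ES actually allocates per GeoPoint, and the real overhead may be much higher.

```python
# Back-of-envelope counts for the scenario described above.
# BYTES_PER_POINT is an assumption (two 8-byte doubles per point), not a
# measured Elasticsearch figure; real field-data overhead may be much higher.

DOCS_TOTAL = 100_000
POINTS_PER_DOC = 100          # average size of geometry.coordinates
DOCS_MATCHING_QUERY = 10      # hypothetical number of query hits
BYTES_PER_POINT = 16          # assumption: lat + lon stored as doubles


def point_count(docs: int, points_per_doc: int = POINTS_PER_DOC) -> int:
    """Number of GeoPoints held by `docs` documents."""
    return docs * points_per_doc


def raw_mebibytes(points: int, bytes_per_point: int = BYTES_PER_POINT) -> float:
    """Raw coordinate payload in MiB, ignoring any per-entry overhead."""
    return points * bytes_per_point / (1024 * 1024)


all_points = point_count(DOCS_TOTAL)
hit_points = point_count(DOCS_MATCHING_QUERY)

print(f"all documents:   {all_points:,} points, ~{raw_mebibytes(all_points):.1f} MiB raw")
print(f"query hits only: {hit_points:,} points, ~{raw_mebibytes(hit_points):.3f} MiB raw")
```

If that assumption is anywhere near right, the raw coordinates alone should be small compared with the heap, which is part of why the OOM surprises me.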
Speculation: It seems that, regardless of the query, ES attempts to load
all the GeoPoints in geometry.coordinates into memory.
Example I'm using in ES head (note that the bool query for user:kimchy
returns 0 documents):
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": {
            "term": {
              "user": "kimchy"
            }
          }
        }
      },
      "filter": {
        "geo_distance": {
          "distance": "200km",
          "geometry.coordinates": {
            "lat": 43,
            "lon": -79
          }
        }
      }
    }
  }
}
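In case anyone wants to reproduce this outside of ES head, the same request body can be built programmatically. This is only a minimal sketch: the function name and its parameters are hypothetical, and it just builds and prints the JSON without contacting a cluster.

```python
import json


def build_filtered_geo_query(user, lat, lon, distance="200km",
                             field="geometry.coordinates"):
    """Return the filtered-query body from this post as a Python dict.

    Hypothetical helper for illustration only; it mirrors the JSON above.
    """
    return {
        "query": {
            "filtered": {
                # The query part: a bool/must/term match on the user field.
                "query": {"bool": {"must": {"term": {"user": user}}}},
                # The filter part: geo_distance against the GeoPoint array.
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


body = build_filtered_geo_query("kimchy", 43, -79)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into ES head or sent as the body of a _search request.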
Other tidbits:
 - ES 0.19.9
 - Java 7 Update 9
 - Windows 7, 64 bit
 - 8GB memory allocated to ES
 - 16GB memory total
 
Any help or suggestions are much appreciated!