The geo_distance filter is specified as part of a "filter" element like so:
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "foo": "bar"
              }
            }
          ]
        }
      },
      "filter": [
        {
          "geo_distance": {
            "distance": 0.3728226,
            "location": {
              "lat": 40.797307367399654,
              "lon": -74.30757522583008
            }
          }
        }
      ]
    }
  }
}
If you're not using facets, it is beneficial to put the geo_distance
filter as a top-level filter in your search request.
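A minimal sketch of what that could look like, reusing the term query and geo point from your request. It assumes a release where the top-level element is still named "filter" (it was renamed "post_filter" in 1.0):

```json
{
  "query": {
    "bool": {
      "must": [
        { "term": { "foo": "bar" } }
      ]
    }
  },
  "filter": {
    "geo_distance": {
      "distance": 0.3728226,
      "location": {
        "lat": 40.797307367399654,
        "lon": -74.30757522583008
      }
    }
  }
}
```

A top-level filter is applied to the hits only, not to facets, which is why this only makes sense when no facets need to see the filtered set.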
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.1) (6b22-1.10.1-0ubuntu1)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
Actually, this is a very old Java version. I recommend upgrading to the
latest Java 7 release (update 10) or at least the latest Java 6
(update 38) release. This will improve the performance and stability
of your cluster.
As you can see the cache size is 1.7GB. At this rate we'd need 10 of these
servers to host 10x the number of records under the current schema. This
would cost around 6K per month on AWS, which is just not tenable. (I'm
assuming heap_used is high simply because Java hasn't needed to GC)
The geo_distance filter relies on all geo points for the field being
loaded into memory (the fielddata cache). There is no way around that at
the moment. The heap_used value is higher because it includes unreferenced
objects that have not been collected yet. After a full GC, heap_used
should be much lower.
If we switch to a geo_shape, would that mean we don't need as much memory?
Yes. The geo_shape filter doesn't rely on the fielddata cache.
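As a rough sketch only: this assumes the "location" field has been re-mapped as type geo_shape and the documents reindexed, which the current geo_point mapping does not give you. The envelope (upper-left and lower-right corners, each as [lon, lat]) is a hypothetical bounding box around your point, not an exact equivalent of the distance circle:

```json
{
  "query": {
    "filtered": {
      "query": {
        "term": { "foo": "bar" }
      },
      "filter": {
        "geo_shape": {
          "location": {
            "shape": {
              "type": "envelope",
              "coordinates": [[-74.32, 40.81], [-74.29, 40.78]]
            }
          }
        }
      }
    }
  }
}
```

Please check the exact syntax against the documentation for the version you are running before relying on it.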
Martijn