I am getting a failure from a geo_distance query. It's returning hits but
also returning a failure status and message:
"failures" : [ {
"index" : "lidb",
"shard" : 4,
"status" : 500,
"reason" : "QueryPhaseExecutionException[[lidb][4]:
query[filtered(ConstantScore(NotDeleted(+cache(_type:locality)
+GeoDistanceFilter(location, ARC, 155.3427980593335, 32.0,
-117.0))))->cache(_type:locality)],from[0],size[10]: Query Failed [Failed
to execute main query]]; nested: StringIndexOutOfBoundsException[String
index out of range: -1]; "
} ]
Details follow:
I read the documentation and various examples and elasticsearch community
posts.
I had an index named "lidb" with about 297 documents, exploring the cool
behavior of the snowball analyzer, phrase matching, and my own table-based
synonym facility on top of ES (that's another, and also very happy, story).
All done so far using the Java API. But as I've discovered, the toString
method of the query builder and filter builder classes emits pretty JSON, so
it's surprisingly easy to write the Java code and quickly check it against
the JSON-based API documentation.
First off, my local ElasticSearch cluster (one node: my local laptop) has
the following configured analyzer. I imagine adding the "location" field
in one or more types to this default configuration:
index:
  analysis:
    analyzer:
      # set stemming analyzer with no stop words as the default
      default:
        type: snowball
        stopwords: none
    filter:
      stopWordsFilter:
        type: stop
        stopwords: none
Based on an example I found, I deleted the index (and all documents in
it???), then recreated the index and added the mapping for the "locality"
type (to hold cities, towns, and so on). All returned with success:
curl -XDELETE 'http://localhost:9200/lidb'
curl -XPUT 'http://localhost:9200/lidb'
curl -XPUT 'http://localhost:9200/lidb/locality/_mapping' -d '{ "locality"
: { "properties" : { "location" : {"type" : "geo_point"}}}}'
Next, I bulk-loaded the following tiny subset of data with "location": [
lon, lat ] according to the GeoJSON recommendations. The latitude and
longitude fields are left over from a conversion artifact; I'll be removing
them. But otherwise, the bulk load was successful (no surprises there):
{ "index" : { "_index" : "lidb", "_type" : "subscriber", "_id" :
"8004441616" } }
{ "telno" : "8004441616", "cnam" : "SUNNY SUNGLASS", "o" : "Sunny's
Sunglasses", "city" : "San Dimas", "state" : "CA", "latitude" : 34.102908,
"longitude" : -117.816249, "location" : [ -117.816249, 34.102908 ] }
{ "index" : { "_index" : "lidb", "_type" : "subscriber", "_id" :
"8004442626" } }
{ "telno" : "8004442626", "cnam" : "RAINY SUNGLASS", "o" : "Rainy's
Sunglasses", "city" : "San Dimas", "state" : "CA", "latitude" : 34.102908,
"longitude" : -117.816249, "location" : [ -117.816249, 34.102908 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "1" } }
{ "city" : "Abbeville", "state" : "AL", "latitude" : 31.566367, "longitude"
: -85.251300, "location" : [ -85.251300, 31.566367 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "2" } }
{ "city" : "San Carlos", "state" : "CA", "latitude" : 37.499187,
"longitude" : -122.263278, "location" : [ -122.263278, 37.499187 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "3" } }
{ "city" : "San Clemente", "state" : "CA", "latitude" : 33.437828,
"longitude" : -117.620397, "location" : [ -117.620397, 33.437828 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "4" } }
{ "city" : "Sand City", "state" : "CA", "latitude" : 36.614759, "longitude"
: -121.850060, "location" : [ -121.850060, 36.614759 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "5" } }
{ "city" : "San Diego", "state" : "CA", "latitude" : 32.779541, "longitude"
: -117.146344, "location" : [ -117.146344, 32.779541 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "6" } }
{ "city" : "San Diego Country Estates", "state" : "CA", "latitude" :
33.002636, "longitude" : -116.799005, "location" : [ -116.799005, 33.002636
] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "7" } }
{ "city" : "San Dimas", "state" : "CA", "latitude" : 34.102908, "longitude"
: -117.816249, "location" : [ -117.816249, 34.102908 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "8" } }
{ "city" : "San Fernando", "state" : "CA", "latitude" : 34.287251,
"longitude" : -118.438836, "location" : [ -118.438836, 34.287251 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "9" } }
{ "city" : "San Francisco", "state" : "CA", "latitude" : 37.759881,
"longitude" : -122.437392, "location" : [ -122.437392, 37.759881 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "10" } }
{ "city" : "San Gabriel", "state" : "CA", "latitude" : 34.094176,
"longitude" : -118.098449, "location" : [ -118.098449, 34.094176 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "11" } }
{ "city" : "New York Mills", "state" : "MN", "latitude" : 46.519423,
"longitude" : -95.373026, "location" : [ -95.373026, 46.519423 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "12" } }
{ "city" : "West New York", "state" : "NJ", "latitude" : 40.788400,
"longitude" : -74.013090, "location" : [ -74.013090, 40.788400 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "13" } }
{ "city" : "Albuquerque", "state" : "NM", "latitude" : 35.110703,
"longitude" : -106.609991, "location" : [ -106.609991, 35.110703 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "14" } }
{ "city" : "New York", "state" : "NY", "latitude" : 40.704234, "longitude"
: -73.917927, "location" : [ -73.917927, 40.704234 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "15" } }
{ "city" : "New York Mills", "state" : "NY", "latitude" : 43.102569,
"longitude" : -75.292105, "location" : [ -75.292105, 43.102569 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "16" } }
{ "city" : "Niagara Falls", "state" : "NY", "latitude" : 43.094305,
"longitude" : -79.017339, "location" : [ -79.017339, 43.094305 ] }
{ "index" : { "_index" : "lidb", "_type" : "locality", "_id" : "17" } }
{ "city" : "Yoder", "state" : "WY", "latitude" : 41.917560, "longitude" :
-104.295060, "location" : [ -104.295060, 41.917560 ] }
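For reference, here is a minimal Python sketch of how bulk lines like the ones above could be generated from records that carry separate latitude and longitude values. The helper name bulk_lines and the record layout are hypothetical, not part of any Elasticsearch client. It illustrates two points: the array form of a geo_point is [lon, lat], and each action and each document must occupy exactly one line of the bulk body (the wrapping in the listing above is most likely an artifact of posting).

```python
import json


def bulk_lines(index, doc_type, docs):
    """Yield bulk-API lines: one action line, then one source line, per document.

    `docs` is an iterable of (id, record) pairs; this layout is an assumption
    made for the sketch.
    """
    for doc_id, doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        source = dict(doc)
        # The array form of a geo_point is [lon, lat] (GeoJSON order).
        source["location"] = [doc["longitude"], doc["latitude"]]
        # json.dumps emits no newlines by default, so each object stays on one line.
        yield json.dumps(action)
        yield json.dumps(source)


docs = [
    ("5", {"city": "San Diego", "state": "CA",
           "latitude": 32.779541, "longitude": -117.146344}),
    ("7", {"city": "San Dimas", "state": "CA",
           "latitude": 34.102908, "longitude": -117.816249}),
]

# The bulk body is newline-delimited and must end with a newline.
body = "\n".join(bulk_lines("lidb", "locality", docs)) + "\n"
print(body)
```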
So I tried out my very first geo_distance query. It returned 3 hits. I have
to go back and manually calculate all the distances to verify the accuracy,
but for now I trust ES (of course).
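The manual check can be sketched in a few lines of Python using the haversine formula. This is only an approximation of what the ARC distance type computes: the Earth radius below is an assumed mean value, and the function and variable names are made up for the sketch. The coordinates are the California localities from the bulk data above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; ES may use a slightly different value


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


ORIGIN = (32.0, -117.0)  # lat, lon used in the query
RADIUS_KM = 250.0

# (id, city, lat, lon) taken from the bulk data
LOCALITIES = [
    ("2", "San Carlos", 37.499187, -122.263278),
    ("3", "San Clemente", 33.437828, -117.620397),
    ("4", "Sand City", 36.614759, -121.850060),
    ("5", "San Diego", 32.779541, -117.146344),
    ("6", "San Diego Country Estates", 33.002636, -116.799005),
    ("7", "San Dimas", 34.102908, -117.816249),
    ("8", "San Fernando", 34.287251, -118.438836),
    ("9", "San Francisco", 37.759881, -122.437392),
    ("10", "San Gabriel", 34.094176, -118.098449),
]

within = []
for doc_id, city, lat, lon in LOCALITIES:
    d = haversine_km(ORIGIN[0], ORIGIN[1], lat, lon)
    flag = "inside" if d <= RADIUS_KM else "outside"
    print(f"{doc_id:>2} {city:<26} {d:7.1f} km  {flag}")
    if d <= RADIUS_KM:
        within.append(doc_id)
```

By this estimate San Diego, San Diego Country Estates, and San Dimas fall inside 250 km, which agrees with the three hits. San Clemente also comes out inside the radius (roughly 170 km) yet is not among the hits. That would be consistent with its document living on the shard that failed, although nothing in the response confirms it.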
But what is odd is that it also returned a failure exception:
curl -XGET 'http://localhost:9200/lidb/locality/_search?pretty=true' -d '{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "geo_distance": {
          "distance": "250km",
          "locality.location": {
            "lat": 32,
            "lon": -117
          }
        }
      }
    }
  }
}'
And here is the response:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 4,
"failed" : 1,
"failures" : [ {
"index" : "lidb",
"shard" : 4,
"status" : 500,
"reason" : "QueryPhaseExecutionException[[lidb][4]:
query[filtered(ConstantScore(NotDeleted(+cache(_type:locality)
+GeoDistanceFilter(location, ARC, 155.3427980593335, 32.0,
-117.0))))->cache(_type:locality)],from[0],size[10]: Query Failed [Failed
to execute main query]]; nested: StringIndexOutOfBoundsException[String
index out of range: -1]; "
} ]
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [ {
"_index" : "lidb",
"_type" : "locality",
"_id" : "5",
"_score" : 1.0, "_source" : { "city" : "San Diego", "state" : "CA",
"latitude" : 32.779541, "longitude" : -117.146344, "location" : [
-117.146344, 32.779541 ] }
}, {
"_index" : "lidb",
"_type" : "locality",
"_id" : "6",
"_score" : 1.0, "_source" : { "city" : "San Diego Country Estates",
"state" : "CA", "latitude" : 33.002636, "longitude" : -116.799005,
"location" : [ -116.799005, 33.002636 ] }
}, {
"_index" : "lidb",
"_type" : "locality",
"_id" : "7",
"_score" : 1.0, "_source" : { "city" : "San Dimas", "state" : "CA",
"latitude" : 34.102908, "longitude" : -117.816249, "location" : [
-117.816249, 34.102908 ] }
} ]
}
}
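Because the search still returns hits, a partial failure like this is easy to miss on the client side. The following Python sketch shows one way to detect it by inspecting the "_shards" section of a parsed response. The function name shard_failures is hypothetical, and the response dict is a trimmed copy of the output above, not a full response. As an aside, it also checks that the 155.3427980593335 in the GeoDistanceFilter string appears to be the requested 250 km expressed in miles.

```python
def shard_failures(response):
    """Return (failed_count, failures) from the "_shards" section of a search response."""
    shards = response.get("_shards", {})
    return shards.get("failed", 0), shards.get("failures", [])


# Trimmed copy of the response shown above (reason shortened, hits reduced to ids).
response = {
    "timed_out": False,
    "_shards": {
        "total": 5,
        "successful": 4,
        "failed": 1,
        "failures": [{
            "index": "lidb",
            "shard": 4,
            "status": 500,
            "reason": "StringIndexOutOfBoundsException[String index out of range: -1]",
        }],
    },
    "hits": {"total": 3, "hits": [{"_id": "5"}, {"_id": "6"}, {"_id": "7"}]},
}

failed, failures = shard_failures(response)
if failed:
    # Hits are present, but they only cover the shards that succeeded.
    print(f"partial result: {failed} of {response['_shards']['total']} shards failed")
    for f in failures:
        print(f"  [{f['index']}][{f['shard']}] status {f['status']}: {f['reason']}")

KM_PER_MILE = 1.609344
# 250 km converted to miles; compare with the value in the failure message.
distance_miles = 250 / KM_PER_MILE
print(distance_miles)
```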
But the cluster health is normal. I expect yellow: this is running on a
local MacBook (a single node) with the default of 5 shards and 1 replica,
so the replicas cannot be assigned. There are also a few twitter documents
that were added during my exploration of the examples; those weren't
modified, as they are in another index:
$ curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
"cluster_name" : "brian-exploration",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 10,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 10
}
Of course, I'll completely erase the data subdirectory and restart and try
this again from a clean start. But this particular failure might indicate a
problem, and I'll leave my setup alone if you need more details from it.