Weird aggregation issue


(Emanuel Buzek) #1

Hi,

we ran into a weird problem with aggregations on our production cluster (3 nodes, all replicating a ~60 GB index).

Problem: an aggregation (sum, min, max; all behave the same) on one field (called 'dobirka') returns totally incorrect results about 70% of the time. Aggregations on other fields work as expected. Does this suggest some index corruption?

My query:
curl -XPOST "http://selen:9200/kolos-index/VyhledavaniCesta/_search" -d '
{
  "query": {
    "filtered": {
      "filter": {
        "term": { "immutableId": "701546403" }
      }
    }
  },
  "_source": "dobirka",
  "aggs": { "MAX-VAL": { "max": { "field": "dobirka" } } }
}'

Correct output, which I get about 30% of the time:
{"took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"kolos-index","_type":"VyhledavaniCesta","_id":"701546403","_score":1.0,"_source":{"dobirka":25752.0}}]},"aggregations":{"MAX-VAL":{"value":25752.0,"value_as_string":"25752.0"}}}

Incorrect output:

{"took":3,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"kolos-index","_type":"VyhledavaniCesta","_id":"701546403","_score":1.0,"_source":{"dobirka":25752.0}}]},"aggregations":{"MAX-VAL":{"value":4.6728078698154557E18,"value_as_string":"4.6728078698154557E18"}}}

Note that the query matches exactly one document (see "total":1 in the output). The field is mapped as 'double'; I checked that in the mapping. The correct value of the field, 25752.0, is clearly visible in the _source.

Cloning the index to our test cluster leads to correct results 100% of the time.

I've tried restarting our nodes one by one, but it doesn't help.

Clues will be greatly appreciated! Thanks!

EDIT: just to add, all our nodes are running ES 1.5.0

-Emanuel


(Colin Goodheart-Smithe) #3

I think you are hitting the same issue as described in this post: "Sum Aggregation returning very small, unrelated values".

Unfortunately, this is a known bug in 1.x; it has been fixed in the upcoming 2.0 release.
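
A quick arithmetic check on the numbers in the first post may help here. The incorrect value, 4.6728078698154557E18, appears to be the raw 64-bit pattern of the double 25752.0 read as a long integer. The Python sketch below only reinterprets the bits; the helper name double_bits_as_long is made up for illustration. Why the value would be read as a long (for example, a same-named field mapped differently in another type of the same index) is an assumption and is not confirmed anywhere in this thread.

```python
import struct


def double_bits_as_long(value: float) -> int:
    """Return the IEEE 754 binary64 bit pattern of `value`,
    reinterpreted as a signed 64-bit integer (hypothetical helper)."""
    return struct.unpack(">q", struct.pack(">d", value))[0]


expected = 25752.0                 # value shown in _source
observed = 4.6728078698154557e18   # value returned by the bad aggregation

raw = double_bits_as_long(expected)
print(hex(raw))                    # 0x40d9260000000000
print(float(raw) == observed)      # True
```

If the check holds for other documents too, that would point to the stored values being intact and only misread at aggregation time, rather than to corrupted data.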


(Emanuel Buzek) #4

Thank you, it looks like I am hitting the same issue.

However, I don't understand how it happened, since the aggregations used to work correctly and we were indexing double values from the beginning. All nodes report a 'double' mapping for this field.

Also, why is the aggregation result incorrect only sometimes? This would suggest only some shards are corrupted, but turning off individual nodes does not help.

