I have been experimenting with the sum aggregation for quite a while now. I have a query that otherwise works perfectly fine for me, but sometimes, for reasons unknown, it returns ridiculously small values.
I have used the following mapping to index my data:
The "score" field only contains values between 1-10, but somehow my aggregation returns a value of 1.4e-322, 1.3e-322, etc.
I wonder why this is happening; I haven't been able to figure out a reason for it either. I would be very grateful if someone could help me out with this issue.
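To debug this, try running something like the query below against your index. This is only a sketch: `your_index` is a placeholder, and the aggregation names (`score_sum`, `score_ranges`, `sample_docs`) are just examples, so adjust them to fit your setup.

```json
POST /your_index/_search
{
  "size": 0,
  "aggs": {
    "score_sum": {
      "sum": { "field": "score" }
    },
    "score_ranges": {
      "range": {
        "field": "score",
        "ranges": [
          { "to": 1 },
          { "from": 1 }
        ]
      },
      "aggs": {
        "sample_docs": {
          "top_hits": { "size": 10 }
        }
      }
    }
  }
}
```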
Basically, this will add a range aggregation alongside your sum aggregation, which will put documents with a score value below 1 into one bucket and ones with a score of 1 or more into another. Then, for each of these buckets, it outputs the top 10 documents. Hopefully this will show whether any of the documents contain unexpected values (values outside the 1-10 range).
Another thing to do would be to check the resulting mappings in your index by running:
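For example (assuming Elasticsearch is running locally; substitute your own index name):

```
curl -XGET 'http://localhost:9200/your_index/_mapping?pretty'
```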
If you can paste the result of that into a gist and link it here, it will give us an idea of what the mapping currently is on that index for all relevant fields (please do not paste a large mapping directly in a reply, as it makes the post very hard to read).
Unfortunately, this is a known issue with dynamic mappings on 1.x. If two of your shards receive a document at the same time, and one maps the field as a long while the other maps it as a double, then the master node will reject the second mapping that is applied, but the shard with the rejected mapping will continue to index documents using the wrong type. Usually the problem becomes visible when nodes are moved around, because then Elasticsearch starts interpreting either some double bits as longs (and you would see huge numbers) or some long bits as doubles (and you would typically see tiny numbers, like here). When hit by this bug, there is no other choice but to reindex. One way to prevent it from happening again in the future would be to configure mappings explicitly.
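For instance, here is a minimal sketch of an explicit mapping that pins the score field to a double (the index and type names are placeholders):

```json
PUT /your_index
{
  "mappings": {
    "your_type": {
      "properties": {
        "score": { "type": "double" }
      }
    }
  }
}
```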
On 2.0, this issue will be fixed, as dynamic mappings will have to be validated on the master node before being applied. You can look at https://github.com/elastic/elasticsearch/pull/10634 for more information.
Thanks a lot, Adrien! That was a really good description and very helpful. Yes, you're right: on other doc types that I have, I have seen the really large numbers as well.
Well, I guess I'd be happiest if this issue is fixed in 2.0, since in a few doc_types and other indexes I've got a really large number of keys in each document, for which configuring mappings explicitly would be pretty difficult. Still, I will try to figure out some way of mapping them explicitly, perhaps with a dynamic template (see the sketch below). Thanks a lot!
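For reference, this is the kind of dynamic template I have in mind. It is just a sketch (the index, type, and template names are made up): it tells Elasticsearch to map any dynamically detected long as a double, so the two types can no longer conflict.

```json
PUT /your_index
{
  "mappings": {
    "your_type": {
      "dynamic_templates": [
        {
          "longs_as_doubles": {
            "match": "*",
            "match_mapping_type": "long",
            "mapping": { "type": "double" }
          }
        }
      ]
    }
  }
}
```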