Yeah, this is unfortunately a hard limitation on the agg framework right now. All aggs convert the values to a double before operating on them. I'm not sure if that'll change, at least easily... it's a pretty fundamental part of how the agg framework operates.
Before anything else, I think it should be noted that losing precision usually isn't a problem for IDs and the like. 64bit doubles have 52 bit mantissa's, so the maximum non-float numeric you can store is 2^53. See this SO answer for a breakdown on why. So that gives you values 0-9 quadrillion (roughly) that can be stored without loss of precision. For most folks that's fine, but if you expect to have a max > 2^53 then it's a concern.
As a workaround, you could do a search with size:1 and sort by the value descending, which will give you the document with the highest value, and you can extract it from the source. Not a great solution, since it only really applies to min/max and not the other aggs, but it may help.
We've stayed away from BigDecimal so far because it's just really, really slow. The overhead of using it is just a no-go for performant aggs. If/when we want support for Longs (or other non-floats) I think we'd probably just integrate it as a framework feature... defaulting to non-float when applicable (min/max/sum is fine, avg never, etc)
Scientific notation doesn't (necessarily) mean that you are losing precision.
5.61380157E8 == 5.61380157 * 10⁸ == 561380157.0
561380157.0 is less than 2⁵³ (9007199300000000), so assuming you believe the SO answer that all values < 2⁵³ can be represented accurately, the number provided is 100% precise without any floating point errors.
I do realise that particular value doesn't, but I worry that if it's already breaking out scientific notation for a mere 9 digits, what will happen with 10, or 15...
You'd think that simply not formatting it like that would produce less questions. "Principle of least surprise", and all that.
No arguments there. The blame falls to Oracle though, not JSON per-se. When Java prints doubles to string, it converts to scientific notation above a certain threshold.
If m is less than 10^-3 or greater than or equal to 10^7, then it is represented in so-called "computerized scientific notation."
You should be safe to do a simple check for any value < 2⁵³ and conclude it's the actual, non-rounded value. And anything greater is potentially problematic. Not ideal I agree, but it is what it is.
The resolution was that it wasn't possible to fix cleanly without breaking compatibility for the java client. But now that the Java REST client is nearly done, and the transport client will be going away, we may be able to revisit this.
Hi all, actually I am trying to search a keyword in range query between particular dates. Can anyone help me?
curl -XGET 'http://localhost:9200/201706/_search' -d '{ "query" :{ "bool" : { "must" : [{ "match" : { "message": "isis"} }, { "range" :{ "created_at" :{ "gte" :"2017-06-01", "lte" : "2017-06-30", "format" :"yyyy/mm/dd||yyyy"}}}]}}}'
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.