Elasticsearch stats aggregation corrupts float number precision

Hi everyone.
I'm using a stats aggregation (max, min, avg) over some documents with float fields.
These floats are stored with two decimal places in the documents (for example 69.43), but the stats aggregation output changes their format and shows more decimal places.

I'm using the Java API, where the org.elasticsearch.search.aggregations.support.ValueType class offers only a double type and has no option to specify precision.

How can I do that within Elasticsearch? I don't want to do it manually afterwards.

May I ask what you mean by "corrupt"?

When you index a document with a float, even if you only provide two decimal places in the JSON, the internal float may or may not have the same precision. Floats are stored using the standard IEEE-754 floating point representation, and not every value can be stored exactly. For example, if you try to store the number 12.12 in an IEEE-754 single-precision float, internally it is stored as 12.11999988555908203125 because that is the closest value the mantissa and exponent can represent.

This calculator is a convenient way to see how floats are stored: https://www.h-schmidt.net/FloatConverter/IEEE754.html
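The same check can be done locally in Java. This is only a minimal sketch: the class and method names are made up for illustration, and it relies on the fact that widening a float to a double is lossless and that new BigDecimal(double) keeps the exact binary value.

```java
import java.math.BigDecimal;

class FloatExactValue {
    // Returns the exact decimal expansion of the value a float actually holds.
    static String exactValue(float f) {
        // float -> double widening is lossless, and new BigDecimal(double)
        // preserves the binary value exactly (no rounding to "nice" decimals).
        return new BigDecimal((double) f).toPlainString();
    }

    public static void main(String[] args) {
        // Should print the long expansion quoted above for 12.12.
        System.out.println(exactValue(12.12f));
        // The value from the original question.
        System.out.println(exactValue(69.43f));
    }
}
```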

So that's part of the issue. The other side of the coin is that Elasticsearch aggregations always operate on doubles, so all values (longs, bytes, shorts, floats, etc.) are upconverted to a double. Widening a float to a double keeps the float's stored approximation exactly, and printing that double then shows the extra digits. There are technical reasons and historical reasons for this decision, but that's how it is today.
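The effect of the upconversion can be reproduced in plain Java without Elasticsearch. The sketch below is only an illustration of float-to-double widening, not Elasticsearch code, and the class and method names are hypothetical.

```java
class FloatWidening {
    // Widens a float to a double, as happens to any float fed into a
    // double-based computation.
    static double widen(float f) {
        return f;
    }

    public static void main(String[] args) {
        float stored = 69.43f;
        double widened = widen(stored);

        // Float.toString prints the shortest decimal that identifies the float.
        System.out.println(stored);
        // The same binary value printed as a double needs more digits.
        System.out.println(widened);
        // The widened value is not the double closest to 69.43.
        System.out.println(widened == 69.43);
    }
}
```

The float prints as 69.43 because Float.toString only emits enough digits to distinguish it from neighbouring floats; the double has many more neighbours, so more digits are needed to identify the same value.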

If you only want to see two decimal places of precision, you can use the format parameter on the aggregation:

"format": "00.00"

That should give you only two digits after the decimal point (I believe; I typed that from memory).

Thank you for your response.
But neither of these two methods, format("00.00") and format("%.2f"), had any effect on the result.

Therefore, I wrote my own code that runs after getting the results and uses the BigDecimal class to change the precision.

Ah, right. Sorry about that, format is only applied to the JSON output. When working in Java, we always provide the actual raw value.

Internally format just uses Java's DecimalFormat so you could do the same, or use something like BigDecimal.
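For anyone landing here later, both options might look roughly like this on the client side. This is a sketch under assumptions: the class and method names are hypothetical, rawMax stands in for a raw double such as the one returned by the stats aggregation (for example from getMax()), and the pattern "0.00" is used instead of "00.00" because the latter would pad the integer part to two digits.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class StatsRounding {
    // Formats a raw double with exactly two decimal places, for display.
    static String formatTwoDecimals(double raw) {
        // Locale.ROOT keeps '.' as the decimal separator on every machine.
        DecimalFormat df =
                new DecimalFormat("0.00", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return df.format(raw);
    }

    // Rounds a raw double to two decimal places, keeping a numeric type.
    static BigDecimal roundTwoDecimals(double raw) {
        return new BigDecimal(raw).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Stand-in for the raw value returned by the aggregation.
        double rawMax = (double) 69.43f;

        System.out.println(formatTwoDecimals(rawMax));
        System.out.println(roundTwoDecimals(rawMax));
    }
}
```

DecimalFormat is the better fit when only a display string is needed; BigDecimal is useful when the rounded value has to be compared or used in further arithmetic.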
