Double Multi-field Loses Precision

It seems a double multi-field under a float property loses precision in ES8. What changed in ES8? Is there a way to configure it (mapping/index/cluster) so that it doesn't lose precision?

I tested with the following index:

PUT test.double
{
  "mappings": { 
    "properties": {
      "amount" : {
        "type" : "float",
        "fields": {
          "d": {
            "type": "double"
          }
        }
      }
    }
  }
}

Post some sample data:

POST test.double/_bulk
{ "index" : { "_id" : "1" } }
{ "amount" : 856250000 }
{ "index" : { "_id" : "2" } }
{ "amount" : 856250000 }
{ "index" : { "_id" : "3" } }
{ "amount" : 856250000 }
{ "index" : { "_id" : "4" } }
{ "amount" : 856250000 }

Run the following aggregation:

GET test.double/_search
{
  "size": 0, 
  "aggs": {
    "float": {
      "sum": { "field": "amount"}
    },
    "float.d": {
      "sum": { "field": "amount.d"}
    }
  }
}

It produces different results on ES7 vs ES8.
In ES7 (correct):

  "aggregations" : {
    "float.d" : {
      "value" : 3.425E9
    },
    "float" : {
      "value" : 3.424999936E9
    }
  }

In ES8 (incorrect):

  "aggregations" : {
    "float.d" : {
      "value" : 3.424999936E9
    },
    "float" : {
      "value" : 3.424999936E9
    }
  }

I've tested several other combinations on ES8:

  • property is keyword, fields are float and double > correct result
  • property is integer, fields are float and double > double is correct, but the float field returns an integer value (the test data has a decimal point)

It seems the float type has some issues.

Hey @dna01,

I came across this issue before. Please check this: Elasticsearch "data": { "type": "float" } returns incorrect results

On ES9 the output no longer uses scientific notation. But I copied and pasted exactly what you gave into Kibana 9's DevTools and I get:

  "aggregations": {
    "float": {
      "value": 3424999936
    },
    "float.d": {
      "value": 3425000000
    }
  }

This looks correct to me.

Double precision can hold the value exactly, and clearly 4 * 856250000 is indeed 3425000000.

Single precision cannot hold 856250000 exactly. You usually get about 7 to 8 significant decimal digits, so the IEEE bit representation of 856250000 is the same as the IEEE bit representation of 856249984 (and of a range of other integers/floats around that point on the number line). The sum of four 856249984s (in single or double precision) is 3424999936, which is what is returned.

So ES9 seems to give the right answer. Which specific 8.x version are you (@dna01) using?

See also the similar discussion in this thread.
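
The rounding described above can be checked outside Elasticsearch. Here is a minimal Python sketch; `to_float32` is a helper name made up for this illustration, and it only demonstrates IEEE 754 single-precision rounding, not anything Elasticsearch-specific:

```python
import struct

def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value.

    The result is returned as a Python float (a double), which can
    represent every single-precision value exactly.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Between 2**29 and 2**30 adjacent float32 values are 64 apart,
# so 856250000 has to move to a multiple of 64.
stored = to_float32(856250000.0)
print(int(stored))          # 856249984
print(int(stored * 4))      # 3424999936 (what the float sum reports)
print(856250000 * 4)        # 3425000000 (what the double sum reports)
```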

I tested on v8.10.4

I understand the precision issue with float data type. I just wasn’t aware that it affected the double sub field.

I assume it has been fixed again in v9. I don't have a v9 environment yet, so I cannot test it, but thank you for the info.

I understand the precision issue with the float data type. This particular issue is more about the double field, which somehow also lost precision when defined under a float property.
And, as a separate issue, the float field got truncated when defined under an integer property.

I was curious, so I spun up an 8.10.4 instance on my (Apple silicon) Mac, and with the Edge browser I see:

  "aggregations": {
    "float.d": {
      "value": 3425000000
    },
    "float": {
      "value": 3424999936
    }
  }

So on 8.10.4 too I get what I consider to be the correct response, consistent with 9.x and with 8.11.1 (I had a handy instance).

What tools are you using to generate the "wrong" results? Kibana, curl, Postman, ...? If you are using DevTools, do you have any browser plugins or any tuning of Kibana? I'm now not so sure there is an Elasticsearch bug here; it may be in some other tool. I have a vague recollection of some JSON tooling/library having issues with losing precision.

Sorry, I think I posted an incorrect sample. I got confused when testing different combinations. As for the different output format, I don't remember which environment I copied it from, so just disregard that output.

Since I cannot update the original post, here are the test payload and output (the other steps are the same). I used Kibana in the Chrome browser for these tests, with no extensions.

POST test.double/_bulk
{"index":{}}
{ "amount": 5.2 }
{"index":{}}
{"amount": 5.8}
{"index":{}}
{"amount": 5.1}
{"index":{}}
{"amount": 5.6}
{"index":{}}
{"amount": 4.2}
{"index":{}}
{ "amount": 4.0}

on v7 (Windows 11)

  "aggregations" : {
    "float.d" : {
      "value" : 29.9
    },
    "float" : {
      "value" : 29.899999618530273
    }
  }

on v8, tested on 8.10.4 (Ubuntu 20.04.6 LTS) & 8.15.2 (Windows 11)

  "aggregations": {
    "float.d": {
      "value": 29.899999618530273
    },
    "float": {
      "value": 29.899999618530273
    }
  }
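
As a sanity check on the two numbers above, the sketch below sums the six test amounts twice in plain Python: once as doubles, and once after rounding each amount to single precision. `to_float32` is a helper name made up for this illustration. The second sum comes out at roughly the 29.8999996... figure reported for both fields on v8, which is consistent with the double sub-field receiving values that had already been narrowed to float. This is only arithmetic; it says nothing about what Elasticsearch does internally.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value (returned as a double)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

amounts = [5.2, 5.8, 5.1, 5.6, 4.2, 4.0]

# Sum of the values as doubles: close to 29.9.
sum_as_double = math.fsum(amounts)

# Sum of the values after each one was rounded to float32 first:
# close to 29.899999618530273.
sum_as_float = math.fsum(to_float32(a) for a in amounts)

print(sum_as_double)
print(sum_as_float)
```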

OK.

So on 9.x I now see the same as you. You actually need just one doc, with a "helpful" value, to reproduce it. Also, simply quoting the input value on 9.x produces the "expected" result.

But, assuming your report on 7.x is accurate, this would indeed be a change in behavior from 7.x. It seems to be relevant only to multi-fields, by which I mean that my very brief attempts to reproduce it in other scenarios failed. Here are my minimal steps to reproduce, for illustration purposes:
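
For anyone who wants to try the quoted form against their own data, here is a minimal Python sketch that builds a _bulk body with the numbers serialized as JSON strings. `bulk_body` and its parameters are names made up for this illustration. It relies on numeric fields accepting numeric strings (the default `coerce` behavior), and whether quoting is a dependable workaround across versions is an assumption based only on the observation above.

```python
import json

def bulk_body(amounts, field="amount"):
    """Build an NDJSON _bulk body where each value is sent as a quoted string."""
    lines = []
    for amount in amounts:
        lines.append(json.dumps({"index": {}}))
        # str() turns 5.2 into "5.2", so the value is serialized with quotes.
        lines.append(json.dumps({field: str(amount)}))
    # The _bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

print(bulk_body([5.2, 5.8, 5.1]))
```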

DELETE /floatingpointishard

PUT /floatingpointishard
{
  "mappings": {
    "properties": {
      "par": {
        "type": "float",
        "fields": {
          "d": {
            "type": "double"
          },
          "f": {
            "type": "float"
          }
        }
      }
    }
  }
}

POST floatingpointishard/_bulk
{ "index" : { "_id" : "1" } }
{ "par" :  "5.1" }

GET floatingpointishard/_search
{
  "size": 0, 
  "aggs": {
    "f": {
      "sum": { "field": "par.f"}
    },
    "d": {
      "sum": { "field": "par.d"}
    },
    "par": {
      "sum": { "field": "par"}
    }
  }
}

Note the value "5.1" is quoted in the _bulk indexing call. This produces:

  "aggregations": {
    "f": {
      "value": 5.099999904632568
    },
    "d": {
      "value": 5.1
    },
    "par": {
      "value": 5.099999904632568
    }
  }

And compare with:

DELETE /floatingpointishard

PUT /floatingpointishard
{
  "mappings": {
    "properties": {
      "par": {
        "type": "float",
        "fields": {
          "d": {
            "type": "double"
          },
          "f": {
            "type": "float"
          }
        }
      }
    }
  }
}

POST floatingpointishard/_bulk
{ "index" : { "_id" : "1" } }
{ "par" :  5.1 }

GET floatingpointishard/_search
{
  "size": 0, 
  "aggs": {
    "f": {
      "sum": { "field": "par.f"}
    },
    "d": {
      "sum": { "field": "par.d"}
    },
    "par": {
      "sum": { "field": "par"}
    }
  }
}

Note the value 5.1 was not quoted in the _bulk indexing call this time. This produces:

  "aggregations": {
    "f": {
      "value": 5.099999904632568
    },
    "d": {
      "value": 5.099999904632568
    },
    "par": {
      "value": 5.099999904632568
    }
  }
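
One way to picture the difference between the two results above is the toy model below. It is purely a guess at the shape of the behavior, not Elasticsearch's actual code path, and all function names are made up for this illustration. In the "quoted" model each field parses the original text itself, so par.d keeps full double precision. In the "unquoted" model the number is narrowed to float once for the parent field and the sub-fields only see the narrowed value.

```python
import struct

def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value (returned as a double)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def quoted_model(text: str) -> dict:
    """Hypothetical: every field parses the original text independently."""
    return {
        "par": to_float32(float(text)),
        "par.f": to_float32(float(text)),
        "par.d": float(text),  # double parsed straight from the text
    }

def unquoted_model(text: str) -> dict:
    """Hypothetical: the value is narrowed to float once, then shared."""
    narrowed = to_float32(float(text))
    return {"par": narrowed, "par.f": narrowed, "par.d": narrowed}

print(quoted_model("5.1"))    # par.d stays at 5.1
print(unquoted_model("5.1"))  # par.d is roughly 5.0999999046
```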

I doubt I can help further here; this would need someone who knows the actual code paths to fully clarify. I hope they do, as it's an interesting find.