Hi All,
I think maybe similar subject was brought up here before, however, I was not able to find a satisfactory answer from the past posts in this forum. So I create a post of my own.
I would like to state the constraint here first. The Elasticsearch index in discussion is at enterprise level, so as a data consumer, I will have very little control over the mapping of the index. Given this constraint, I would like to find a best solution to solve the lose of precision problem when performing aggregation sum on this index.
For the index in question, the decimal field "CADAmount" is mapped to FLOAT, and as you see per below, "714892.823215613" turned into "714892.8125" after aggregation. I understand that "714892.8125" might be the actual value got indexed(doc value) by the Elasticsearch, but I am unwilling to do a script sum with params._source.field1.field2..... because give 2 billion records on the index, the performance penalty will be significant.
My question is what will be the best solution give the constraint I have to ensure that I will have the aggregation sum precision of up to four decimal places (or at least 2 decimal place precision). Thanks in advance.
PUT index/_doc/1
{
"accHeader": {
"accDetail": [
{
"CADValue": 714892.823215613
},
{
"USDValue": 549917.556319702
}
]
}
}
GET index/_mapping
{
"mappings": {
"data": {
"properties": {
"accHeader": {
"properties": {
"accDetail": {
"CADValue": {
"type": "float"
}
}
}
}
}
}
}
}
POST index/_search
{
"size": 0,
"aggs": {
"value_CAD": {
"sum": {
"fields": "accHeader.accDetail.CADValue"
}
}
}
}
Query Output
{
"hits": {
"total": 1,
"max_score": 0,
"hits": []
},
"aggregations": {
"value_CAD": {
"value": 714892.8125
}
}
}