Elasticsearch aggregation shows unexpected result for SUM

I'm trying to apply a sum aggregation and I get an unexpected result.

1. prepare dataset

$ cat products.json
{"index":{"_id":"1"}}
{"productId": 10,"shopId": 45,"prices": {"retailPrice": 525000000.02,"sumRetailPrice": 5250000000.2},"count": 10}
{"index":{"_id":"2"}}
{"productId": 10,"shopId": 48,"prices": {"retailPrice": 26250000004,"sumRetailPrice": 5250000000.8},"count": 20}

2. bulk insert

curl -XPOST localhost:9200/25products/_bulk -H "Content-Type: application/x-ndjson" --data-binary  @./products.json
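
(Optionally, to double-check that both documents made it in, the index can be refreshed and counted — a small sanity check using the same index name:)

curl -XPOST "http://localhost:9200/25products/_refresh"
curl -XGET "http://localhost:9200/25products/_count?pretty"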

3. view mapping

curl -XGET "http://localhost:9200/25products/_mapping?pretty"
{
  "25products" : {
    "mappings" : {
      "properties" : {
        "count" : {
          "type" : "long"
        },
        "prices" : {
          "properties" : {
            "retailPrice" : {
              "type" : "float"
            },
            "sumRetailPrice" : {
              "type" : "float"
            }
          }
        },
        "productId" : {
          "type" : "long"
        },
        "shopId" : {
          "type" : "long"
        }
      }
    }
  }
}

4. Sum the field "prices.sumRetailPrice" with a Painless script

curl --location --request POST 'http://localhost:9200/25products/_search?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"aggs": {"sumSupplyPrice": {
    "sum": {"script": {
        "source": "(!doc.containsKey('\''prices.sumRetailPrice'\'') ? 0 : (doc['\''prices.sumRetailPrice'\''].size() == 0 ? 0: doc['\''prices.sumRetailPrice'\''].value))"
        }}
    }},
"query": {"bool": {
    "filter": [
        {"terms": {"shopId": [45]}},
        {"terms": {"productId": [10]}}
    ]
    }},
"from": 0, "size": 10
}'

The result is:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "25products",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "productId" : 10,
          "shopId" : 45,
          "prices" : {
            "retailPrice" : 5.2500000002E8,
            "sumRetailPrice" : 5.2500000002E9
          },
          "count" : 10
        }
      }
    ]
  },
  "aggregations" : {
    "sumSupplyPrice" : {
      "value" : 5.249999872E9
    }
  }
}
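
(For completeness: the same sum can be written without a script by pointing the aggregation at the field directly; the optional missing parameter stands in for the script's 0 fallback. A minimal variant, assuming the same index and filters:)

curl -XPOST 'http://localhost:9200/25products/_search?pretty&size=0' \
--header 'Content-Type: application/json' \
--data-raw '{
"aggs": {"sumSupplyPrice": {
    "sum": {"field": "prices.sumRetailPrice", "missing": 0}
    }},
"query": {"bool": {
    "filter": [
        {"terms": {"shopId": [45]}},
        {"terms": {"productId": [10]}}
    ]
    }}
}'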

5. Expectation
Since I have a single matching record, I expect the aggregation to return the same value as sumRetailPrice:

"aggregations" : {
    "sumSupplyPrice" : {
      "value" : **5.2500000002E9**
    }
  }

But the actual result is not what I expected:

"aggregations" : {
    "sumSupplyPrice" : {
      "value" : **5.249999872E9**
    }
  }

Where am I going wrong?

Thanks!

Hi @VaZa, welcome to the community.

As an experiment, try creating the mapping first with type double and check the results.
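
The likely cause is the float mapping: in Elasticsearch, float is a single-precision 32-bit IEEE 754 value with roughly seven significant decimal digits, so 5250000000.2 cannot be stored exactly; the nearest representable float is 5249999872, which is exactly the value your aggregation returned. One way to see this for yourself (a sketch against your existing index) is to return the doc-values representation of the field next to _source:

curl -XPOST 'http://localhost:9200/25products/_search?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"query": {"term": {"shopId": 45}},
"_source": ["prices.sumRetailPrice"],
"docvalue_fields": ["prices.sumRetailPrice"]
}'

_source echoes the original JSON, while the fields section should show the rounded float that the sum aggregation actually reads.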

Hi @stephenb, thanks for the welcome!
Let me try it.
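
First, drop the existing 25products index so it can be re-created with an explicit mapping (assuming the index from the first attempt is still there):

curl -XDELETE 'http://localhost:9200/25products'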
1. Create the index

❯ curl -XPUT 'http://localhost:9200/25products/'
{"acknowledged":true,"shards_acknowledged":true,"index":"25products"}

2. set mapping

curl -XPUT 'http://localhost:9200/25products/_mapping' -H 'Content-Type: application/json'  -d '
{

        "properties": {
            "count": {"type": "long"},
            "prices": {"properties":{
                "retailPrice": {"type": "double"},
                "sumRetailPrice": {"type": "double"}
                }
            },
            "productId": {"type": "long"},
            "shopId": {"type": "long"}
        }

}
'
{"acknowledged":true}

3. insert data

curl -XPOST localhost:9200/25products/_bulk -H "Content-Type: application/x-ndjson" --data-binary  @./products.json
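
(Re-checking the mapping with the same call as before should now show double for both price fields:)

curl -XGET "http://localhost:9200/25products/_mapping?pretty"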
4. Run the aggregation
curl --location --request POST 'http://localhost:9200/25products/_search?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"aggs": {"sumSupplyPrice": {
    "sum": {"script": {
        "source": "(!doc.containsKey('\''prices.sumRetailPrice'\'') ? 0 : (doc['\''prices.sumRetailPrice'\''].size() == 0 ? 0: doc['\''prices.sumRetailPrice'\''].value))"
        }}
    }},
"query": {"bool": {
    "filter": [
        {"terms": {"shopId": [45]}},
        {"terms": {"productId": [10]}}
    ]
    }},
"from": 0, "size": 10
}'

5. ... and it works as expected:

"aggregations" : {
    "sumSupplyPrice" : {
      "value" : 5.2500000002E9
    }
  }

Thanks, Stephen.
Well... I assume this works when we know our dataset up front, but what if we don't?

My .NET code creates indexes on demand when new shops appear, and each of them might have different field names and values for monetary figures.

Is there any option to change a field type from "float" to "double" without re-creating the index?

Good.

No, a mapping cannot be changed after the index is created, but there is a solution.

You create an index template and define dynamic mappings that will create the right types for new fields based on the name, etc.

Your mapping would be something like this... you can also use conditions to match only certain field name patterns... take a look... it's not a trivial concept, but it may be what you need.

PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "my-doubles": {
          "match_mapping_type": "double",
          "mapping": {
            "type": "double"
          }
        }
      }
    ]
  }
}
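
Since your .NET code creates indexes on demand, the dynamic templates usually go into an index template so that every new index picks them up automatically. A rough sketch, assuming Elasticsearch 7.8+ (composable index templates); the template name and index pattern below are only illustrative:

curl -XPUT 'http://localhost:9200/_index_template/shop-products' \
--header 'Content-Type: application/json' \
--data-raw '{
"index_patterns": ["*products*"],
"template": {
    "mappings": {
        "dynamic_templates": [
            {
                "prices-as-doubles": {
                    "match_mapping_type": "double",
                    "mapping": {"type": "double"}
                }
            }
        ]
    }
}
}'

Here match_mapping_type is "double" because that is what the JSON parser detects for any floating-point number (there is no "float" detection type); without the template such fields are dynamically mapped as float. Adding something like "match": "*Price*" next to match_mapping_type would restrict the rule to your monetary fields.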

Wow, I need to digest this a bit :slight_smile:
Thanks for pointing me in the right direction, @stephenb.
I'll come back with results.
