Need help with "sum by group" aggregations on a simple doc structure

I'm using Elasticsearch 7.8.1. My documents are POJO records like this:

    {
        "timestamp": "2020-11-07 08:00:00",
        "lineId": 532,
        "sn": "A2600101000005DF",
        "upBytes": 5927,
        "dlBytes": 13706
    }

I'm trying to sum my records per device, like the SQL below (the timestamp bounds are placeholders):

    select sn, sum(upBytes) as totalUpBytes, sum(dlBytes) as totalDlBytes from traffic
    where timestamp between 11111 and 2222
    group by sn

I'm using Spring Data Elasticsearch; here's the actual request it generates:

    {
        "size": 0,
        "query": {
            "bool": {
                "must": [
                    {
                        "range": {
                            "timestamp": {
                                "from": 1603260000,
                                "to": 1604838407,
                                "include_lower": true,
                                "include_upper": false,
                                "boost": 1
                            }
                        }
                    }
                ],
                "adjust_pure_negative": true,
                "boost": 1
            }
        },
        "sort": [
            {
                "_seq_no": {
                    "order": "desc"
                }
            }
        ],
        "aggregations": {
            "by_sn": {
                "terms": {
                    "field": "sn",
                    "size": 10,
                    "min_doc_count": 1,
                    "shard_min_doc_count": 0,
                    "show_term_doc_count_error": false,
                    "order": [
                        {
                            "_count": "desc"
                        },
                        {
                            "_key": "asc"
                        }
                    ]
                },
                "aggregations": {
                    "totalUpBytes": {
                        "sum": {
                            "field": "upBytes"
                        }
                    },
                    "totalDlBytes": {
                        "sum": {
                            "field": "dlBytes"
                        }
                    }
                }
            }
        }
    }

It gives me this error:

Elasticsearch exception [type=illegal_argument_exception, reason=Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [sn] in order to load field data by uninverting the inverted index. Note that this can use significant memory.]
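
The error message points at the fix: run the terms aggregation on the keyword sub-field that dynamic mapping created, rather than on the analyzed text field. The relevant fragment of the request would look like this (same aggregation as above, only the field name changes):

    "terms": {
        "field": "sn.keyword",
        "size": 10
    }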

I checked the index mapping: the "sn" field was auto-mapped as "text" instead of "keyword" (that index had no explicit mapping at creation time). So I tried changing "sn" to "sn.keyword", and another exception came up:

class org.elasticsearch.search.aggregations.bucket.terms.ParsedStringTerms cannot be cast to class org.elasticsearch.search.aggregations.metrics.Sum (org.elasticsearch.search.aggregations.bucket.terms.ParsedStringTerms and org.elasticsearch.search.aggregations.metrics.Sum are in unnamed module of loader 'app')

which is more obscure, and I don't quite understand what's happening...

Any suggestions?
Also, what's the better practice for this kind of requirement?

=====================

here's the mapping:

    {
        "traffic": {
            "mappings": {
                "properties": {
                    "dlBytes": {
                        "type": "long"
                    },
                    "lineId": {
                        "type": "long"
                    },
                    "session": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "sn": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "timestamp": {
                        "type": "long"
                    },
                    "upBytes": {
                        "type": "long"
                    }
                }
            }
        }
    }
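
By the way, regarding better practice: I realize that creating the index with an explicit mapping (instead of relying on dynamic mapping) would make "sn" a plain keyword and avoid the text/keyword detour entirely. A minimal sketch, assuming the index is created fresh (types mirror the dumped mapping above):

    PUT /traffic
    {
        "mappings": {
            "properties": {
                "timestamp": { "type": "long" },
                "lineId":    { "type": "long" },
                "sn":        { "type": "keyword" },
                "session":   { "type": "keyword" },
                "upBytes":   { "type": "long" },
                "dlBytes":   { "type": "long" }
            }
        }
    }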

This happens when there are two data types (string and long) on your sum field (upBytes or dlBytes). Did you search across multiple indices that contain the same field name with different data types?

I dumped the records without aggregation, then checked them against the index mapping. All records were correctly mapped.

So I guess the second exception may come from incorrect usage of nested aggregations? I'm not quite sure about this part; I copied it from another thread here:

    "aggregations": {
        "by_sn": {
            "terms": {
                "field": "sn",
                "size": 10,
                "min_doc_count": 1,
                "shard_min_doc_count": 0,
                "show_term_doc_count_error": false,
                "order": [
                    {
                        "_count": "desc"
                    },
                    {
                        "_key": "asc"
                    }
                ]
            },
            "aggregations": {
                "totalUpBytes": {
                    "sum": {
                        "field": "upBytes"
                    }
                },
                "totalDlBytes": {
                    "sum": {
                        "field": "dlBytes"
                    }
                }
            }
        }
    }

No, this aggregation works in my test.

I really appreciate your verification; it confirmed that my aggregation was basically correct.
So I debugged my code line by line and finally found the trap.
It wasn't the query or the aggregation that caused the exception, but incorrect response parsing:
I was reading the terms aggregation result as a sum result. (What?!!)
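
For anyone else hitting the same cast error, the working parsing looks roughly like this. This is only a sketch against the 7.x high-level REST client (imports from `org.elasticsearch.search.aggregations.bucket.terms.Terms` and `org.elasticsearch.search.aggregations.metrics.Sum`); `response` is assumed to be the `SearchResponse`, and the aggregation names match my request above:

    // "by_sn" is a terms (bucket) aggregation, not a Sum
    Terms bySn = response.getAggregations().get("by_sn");
    for (Terms.Bucket bucket : bySn.getBuckets()) {
        String sn = bucket.getKeyAsString();
        // the Sum results live inside each bucket's sub-aggregations
        Sum up = bucket.getAggregations().get("totalUpBytes");
        Sum dl = bucket.getAggregations().get("totalDlBytes");
        System.out.println(sn + ": up=" + up.getValue() + ", dl=" + dl.getValue());
    }

My mistake was effectively doing `Sum s = response.getAggregations().get("by_sn")`, which triggers exactly the `ParsedStringTerms cannot be cast to ... Sum` exception.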

Finally I made it work! Thanks again!
