Terms aggregation skips the dot (.) character

Hi All,

I am facing an issue with Elasticsearch that I have not been able to fix for the last two days.

I have a product document with a list field, dynamicAttributeValues, e.g.
"dynamicAttributeValues": [
  "S",
  "Blue Ridge",
  "10.5",
  "10",
  "11.5",
  "11.5",
  "6.5",
  "6",
  "5",
  "5.5",
  "Large",
  "XLarge"
]

I want to aggregate the results with doc_count, which I am able to do. But I noticed something odd: the terms aggregation is skipping the dot ('.') character, and the aggregation comes back as below:
{
  "size": "65",
  "count": 15437
},
{
  "size": "13",
  "count": 10679
},
{
  "size": "105",
  "count": 9651
},
{
  "size": "55",
  "count": 9359
}

We are getting 65 instead of 6.5, 105 instead of 10.5, and so on.

I googled, binged and searched everywhere but couldn't get a resolution to this.

Please help with this.

Thanks
Jithin Kuriakose

The issue is actually with the mapping, not necessarily the terms aggregation. Terms aggs work on the post-analysis form of the field. So in this case, the field likely has an analyzer that tokenizes on special characters (the standard analyzer splits on most punctuation, for example). So when 5.5 is tokenized by such an analyzer, it can get turned into ["5", "5"] instead of "5.5".

So when the terms agg starts working on the data, it is seeing "5" individually and the periods are lost.
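
To make the effect concrete, here is a minimal Python sketch. It does not call Elasticsearch; `toy_analyzer` is a hypothetical stand-in that simply splits on non-alphanumeric characters, and `terms_counts` only imitates the per-document counting of a terms aggregation. The sample documents are made up for illustration.

```python
import re
from collections import Counter


def toy_analyzer(value):
    """Hypothetical stand-in for an analyzer: split on anything
    that is not a letter or digit, then lowercase each token."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", value) if t]


def keyword_analyzer(value):
    """Stand-in for a non-analyzed field: the whole value is one term."""
    return [value]


def terms_counts(docs, analyze):
    """Count, for each term, how many documents contain it
    (a rough imitation of doc_count in a terms aggregation)."""
    counts = Counter()
    for values in docs:
        terms_in_doc = set()
        for value in values:
            terms_in_doc.update(analyze(value))
        counts.update(terms_in_doc)
    return counts


# Made-up sample documents, each a list of attribute values.
docs = [["5.5", "6.5", "Large"], ["5.5", "0-6 months"]]

analyzed = terms_counts(docs, toy_analyzer)
exact = terms_counts(docs, keyword_analyzer)

print(sorted(analyzed.items()))
print(sorted(exact.items()))
```

With the toy analyzer the bucket keys are fragments such as "5", "6" and "months", and "5.5" never appears as a key. With the non-analyzed stand-in the keys are the original values. A real analyzer chain may behave differently (for example, stripping the character rather than splitting on it), so treat this only as an illustration of the mechanism.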

The solution is to map your field as a keyword instead of text. Keyword fields are not analyzed, so the token being aggregated is identical to the original token.

Generally, text fields are more for search and keyword fields are more for aggs :)
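
As a rough sketch of what that could look like, the request bodies below are built as plain Python dicts and printed as JSON. This assumes a recent Elasticsearch version (7.x or later, where mappings have no type name); the sub-field name `raw` and the aggregation name `attribute_values` are illustrative choices, not anything required by Elasticsearch.

```python
import json

# Field name taken from the question; everything else is illustrative.
FIELD = "dynamicAttributeValues"

# Mapping body: keep the analyzed text field for search and add a
# keyword sub-field (here called "raw") for exact-value aggregations.
mapping_body = {
    "mappings": {
        "properties": {
            FIELD: {
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},
            }
        }
    }
}

# Search body: aggregate on the keyword sub-field, not the text field.
agg_body = {
    "size": 0,  # return only aggregation results, no hits
    "aggs": {
        "attribute_values": {
            "terms": {"field": FIELD + ".raw", "size": 50}
        }
    },
}

print(json.dumps(mapping_body, indent=2))
print(json.dumps(agg_body, indent=2))
```

Note that a sub-field added to an existing index is only populated for documents indexed after the mapping change, so older documents would typically need to be reindexed before they show up under the new field.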


Hi @polyfractal,
Thanks for your response.
The actual issue seems to be different for me.
I have added a new raw multi-field, which is not analyzed. Now I am aggregating on the productSkuDynamicAttributeValues.raw field.
But there are still some problems.

Below is my new mapping for the field.
"productSkuDynamicAttributeValues": {
  "type": "string",
  "fields": {
    "high_precision": {
      "type": "string",
      "analyzer": "high_precision_analyzer"
    },
    "raw": {
      "type": "string",
      "index": "not_analyzed"
    },
    "word_proximity": {
      "type": "string",
      "analyzer": "word_proximity_analyzer"
    }
  },
  "analyzer": "high_recall_analyzer"
},

I have a value "0-6 months", but in the aggregation I am getting "06".

Could you please help me on this?

Thanks
Jithin
