I am facing an issue with elastic search, but not able to fix it for last two days.
I have a document product with a field list dynamicAttributes.
e.g.
dynamicAttributeValues": [
"S",
"Blue Ridge",
"10.5",
"10",
"11.5",
"11.5",
"6.5",
"6",
"5",
"5.5",
"Large",
"XLarge"
]
I want to aggregate the result with doc_count which I am able to do. But the weird thing I noticed that terms aggregation is skipping the dot('.') character as the aggregation coming as below
{
"size": "65",
"count": 15437
},
{
"size": "13",
"count": 10679
},
{
"size": "105",
"count": 9651
},
{
"size": "55",
"count": 9359
}
we are getting 65 instead of 6.5, 105 instead of 10.5.... !!!!!
I googled, binged and searched everywhere but couldn't get a resolution to this.
The issue is actually with the mapping, not necessarily the terms aggregation. Terms aggs work on the post-analysis form of the the field. So in this case, the field likely has an analyzer that tokenizes on special characters (standard analyzer does this). So when 5.5 is tokenized by the analyzer, it gets turned into ["5", "5"] instead of "5.5".
So when the terms agg starts working on the data, it is seeing "5" individually and the periods are lost.
The solution is to map your field as a keyword instead of text. Keyword fields are not analyzed, so the token being aggregated is identical to the original token.
text fields are more for search, keyword more for aggs. Generally
Hi @polyfractal,
Thanks for your response.
The actual issue seems to be different for me.
I have added a new multi raw field, which is not analyzed. Now I am aggregating on productSkuDynamicAttributeValues.raw field
But still there are still some problems.
Below are my new mapping on the field.
"productSkuDynamicAttributeValues": {
"type": "string",
"fields": {
"high_precision": {
"type": "string",
"analyzer": "high_precision_analyzer"
},
"raw": {
"type": "string",
"index": "not_analyzed"
},
"word_proximity": {
"type": "string",
"analyzer": "word_proximity_analyzer"
}
},
"analyzer": "high_recall_analyzer"
},
I have a value "0-6 months", but in aggregation I am getting "06"... !!!!
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.