I created a custom analyzer that does lowercasing. I did not use a normalizer because I need to apply a stopword filter, which normalizers do not support.
"lowercase_analyzer": {
  "type": "custom",
  "tokenizer": "keyword",
  "filter": [
    "truncate",
    "stopwords",
    "lowercase"
  ]
},
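To make the filter order concrete, here is a rough Python sketch of what this chain would do to a single value. It is only an approximation under stated assumptions: the truncate length of 10 is the Elasticsearch default for the truncate filter, the stopword list is a made-up placeholder, and `simulate_lowercase_analyzer` is a hypothetical helper, not part of any Elasticsearch API.

```python
from typing import Optional

# Assumptions (not from my actual config): both values below are placeholders.
TRUNCATE_LENGTH = 10          # default length of the Elasticsearch truncate filter
STOPWORDS = {"the", "and"}    # hypothetical stopword list


def simulate_lowercase_analyzer(value: str) -> Optional[str]:
    """Approximate the filter chain on the single token the keyword tokenizer emits."""
    token = value                     # keyword tokenizer: the whole input is one token
    token = token[:TRUNCATE_LENGTH]   # truncate filter: keep the first N characters
    if token in STOPWORDS:            # stop filter: case-sensitive match drops the token
        return None
    return token.lower()              # lowercase filter runs last


if __name__ == "__main__":
    for sample in ["Hello World Again", "the", "The"]:
        print(repr(sample), "->", repr(simulate_lowercase_analyzer(sample)))
```

Under these assumptions, a value of "the" is dropped, while "The" survives as "the", because the stop filter runs before the lowercase filter and matches case-sensitively.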
I define a subfield with the custom analyzer as follows:
"normalized": {
  "analyzer": "lowercase_analyzer",
  "type": "text",
  "fielddata": true
},
But I have my doubts about using fielddata. Ideally, I want to use doc_values for the aggregations. Since the field is of type text, this is not possible. How much will enabling fielddata hurt stability and performance? Does anyone know another solution?