Hello everyone!
I have a German word with an umlaut, let's say "läuft". My goal is to create an analyzer that produces three tokens for it: "läuft", "laeuft" and "lauft".
I have tried different combinations of the icu_normalizer, asciifolding and snowball (German2) filters, but none of them produced all three tokens. The best result came from the asciifolding token filter, which emits two of the three required tokens: "läuft" and "lauft".
So, basically, I need some kind of custom asciifolding filter for German that also emits the additional "ae"-style variation for words with umlauts.
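To make the requirement concrete, here is a minimal Python sketch of the token expansion I am after. It only illustrates the desired output and is not Elasticsearch code; the function names and the mapping table are hypothetical, and the "ß" to "ss" handling is an assumption that goes beyond the umlaut case described above.

```python
import unicodedata

# Conventional German transliterations (assumed table, for illustration only).
GERMAN_EXPANSIONS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def expand_german(text):
    """Replace each umlaut with its two-letter form, e.g. 'ä' -> 'ae'."""
    return "".join(GERMAN_EXPANSIONS.get(ch, ch) for ch in text)


def strip_diacritics(text):
    """Drop diacritics the way a plain ASCII folding would, e.g. 'ä' -> 'a'."""
    # 'ß' has no Unicode decomposition, so it is mapped explicitly.
    text = text.replace("ß", "ss")
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def umlaut_variants(token):
    """Return the original token plus its expanded and folded variants."""
    # Normalize to precomposed characters so the mapping table matches.
    token = unicodedata.normalize("NFC", token)
    variants = [token, expand_german(token), strip_diacritics(token)]
    # Remove duplicates while keeping the order (original first).
    return list(dict.fromkeys(variants))


print(umlaut_variants("läuft"))  # ['läuft', 'laeuft', 'lauft']
```

A token without umlauts, such as "Haus", should come back unchanged as a single token.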
My configuration for the asciifolding and snowball filters is the following:
"ascii2": {
  "type": "asciifolding",
  "preserve_original": "true"
},
"snow-german2": {
  "type": "snowball",
  "language": "German2"
},
I would really appreciate your help!