I would like to ask for help, and in particular for a pointer on where to start digging. I want to modify the behavior of the fuzzy query in the following way. Say I have a set of candidate tokens for error correction; my goal is to give more "weight" to candidates whose differences from the search term are changes in vowels. An example:
Let's say we search for "baban".
The candidates within the allowed edit distance might be:
"bobon" <- this should have a higher score.
I probably need to add some information to the token payload: not only the number of mismatches, but also the number of vowels and consonants that were changed.
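To make the idea concrete, here is a rough Python sketch of the extra information I have in mind. It is not Elasticsearch code; the function name `count_changes` and the vowel set are my own assumptions, and it only handles substitutions between terms of the same length (no insertions or deletions).

```python
from typing import Tuple

# Assumption: plain lowercase Latin vowels only; a real analyzer would need
# a language-specific set.
VOWELS = set("aeiou")


def count_changes(query: str, candidate: str) -> Tuple[int, int]:
    """Return (vowel_changes, consonant_changes) between two same-length terms.

    Hypothetical helper for illustration: it compares the terms position by
    position, so it only sees substitutions.
    """
    if len(query) != len(candidate):
        raise ValueError("this sketch only handles terms of equal length")

    vowel_changes = 0
    consonant_changes = 0
    for q, c in zip(query, candidate):
        if q == c:
            continue
        if q in VOWELS and c in VOWELS:
            # One vowel was replaced by another vowel.
            vowel_changes += 1
        else:
            # Any other substitution (a consonant is involved on either side).
            consonant_changes += 1
    return vowel_changes, consonant_changes


print(count_changes("baban", "bobon"))  # (2, 0)
```

For "baban" vs. "bobon" this gives two vowel changes and no consonant changes, which is the kind of signal I would like to store alongside the plain mismatch count.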
More generally:
I do not want such a query to rely only on TF/IDF statistics, but also on some linguistic information, such as vowel/consonant substitution.
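As a sketch of how the two signals might be combined, the Python below re-ranks candidates by dividing a base score by a vowel-aware penalty. Everything here is an assumption for illustration: `base_score` merely stands in for whatever score the engine produces, the costs 0.5 and 1.0 are arbitrary, "dadan" is a made-up second candidate, and the formula is just one possible choice.

```python
from typing import List, Tuple

VOWELS = set("aeiou")  # assumption: plain lowercase Latin vowels


def linguistic_penalty(query: str, candidate: str,
                       vowel_cost: float = 0.5,
                       consonant_cost: float = 1.0) -> float:
    """Sum a per-position cost; vowel-for-vowel swaps are cheaper.

    Hypothetical helper: only same-length terms (substitutions) are handled.
    """
    if len(query) != len(candidate):
        raise ValueError("this sketch only handles terms of equal length")
    penalty = 0.0
    for q, c in zip(query, candidate):
        if q == c:
            continue
        both_vowels = q in VOWELS and c in VOWELS
        penalty += vowel_cost if both_vowels else consonant_cost
    return penalty


def rerank(query: str,
           scored_candidates: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Re-rank (term, base_score) pairs, best first.

    base_score is a placeholder for the engine's statistical score.
    """
    rescored = [
        (term, base_score / (1.0 + linguistic_penalty(query, term)))
        for term, base_score in scored_candidates
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Both candidates start with the same base score and the same plain edit
# distance (2), but "bobon" differs only in vowels.
ranked = rerank("baban", [("dadan", 1.0), ("bobon", 1.0)])
print(ranked[0][0])  # bobon
```

With equal base scores, the candidate that differs only in vowels ends up first, which is the behavior I am after.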
I am quite new to Elastic and wanted to ask which token filter I would need to modify (if any token filter is involved in fuzzy queries at all).
Thank you in advance for your help.