From experience, I want my search to be sensitive to subwords. By adding an extensive wordlist I do find the subwords, but I get undesired behavior: if the subwords are frequent, or appear in two fields, they are scored higher than the full word. One solution would be to boost on the token's length, probably by the square of the length. Example:
I have Swedish texts. The word "fastland" (meaning "mainland") gets analysed/tokenized to "fastland", "fast", "land". When I search for "fastland", a text in which the word "Finland" appears 5 times gets a higher score than a text in which the word "fastland" appears once, presumably because "Finland" also yields the subword "land".
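To make the effect concrete, here is a toy calculation in Python. It is only a sketch under loud assumptions: the score is a plain sum of term frequencies (not the BM25 that Elasticsearch actually uses), and "Finland" is assumed to be tokenized to "finland" and "land". The function name `score` and the weighting are my own illustration, not an existing API.

```python
from collections import Counter

def score(query_tokens, doc_tokens, weight=lambda tok: 1):
    """Toy score: sum over query tokens of term frequency * weight(token).

    This is NOT BM25; it only illustrates how frequent subwords can win.
    """
    tf = Counter(doc_tokens)
    return sum(tf[tok] * weight(tok) for tok in query_tokens)

# Tokens produced for the query "fastland" (as described above).
query = ["fastland", "fast", "land"]

# Assumption: each "Finland" is analysed to "finland" and "land".
doc_finland = ["finland", "land"] * 5
doc_fastland = ["fastland", "fast", "land"]

# Unweighted: the subword "land" occurring 5 times beats the full word.
plain_finland = score(query, doc_finland)
plain_fastland = score(query, doc_fastland)

# Weighted by the square of the token length, as proposed.
length_squared = lambda tok: len(tok) ** 2
boosted_finland = score(query, doc_finland, length_squared)
boosted_fastland = score(query, doc_fastland, length_squared)

print(plain_finland, plain_fastland)      # 5 3
print(boosted_finland, boosted_fastland)  # 80 96
```

In this toy model the length-squared weight is just enough to put the "fastland" text first (96 vs. 80); whether the same holds under real BM25 scoring would have to be tested.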
Which Java classes should I subclass to apply a search-time boost to an individual token?
Or is a bit of scripting in Painless the right way to go, i.e. does ctx or some other parameter contain the matching tokens?
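A workaround I am considering, which would avoid both subclassing and Painless, is to analyse the query text on the client side (for instance via the _analyze API) and then build a bool query with one term clause per token, each boosted by the square of its length. The sketch below only builds the request body; the helper name `build_boosted_query` and the field name "body" are hypothetical, and I have not verified how this interacts with BM25 in practice.

```python
def build_boosted_query(tokens, field):
    """Build a bool/should query with one term clause per token.

    Each clause is boosted by len(token) ** 2. The tokens are assumed to be
    exactly what the index analyzer emits (e.g. lowercased), because term
    queries are not analysed.
    """
    should = [
        {"term": {field: {"value": tok, "boost": float(len(tok) ** 2)}}}
        for tok in tokens
    ]
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}

# Tokens for "fastland" as described above; "body" is a hypothetical field.
request_body = build_boosted_query(["fastland", "fast", "land"], "body")
```

The resulting dictionary could be sent as the body of a normal search request; the full word then carries a boost of 64 while each subword carries 16.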