Hunspell analyzer


(Nicolas Buisson) #1

Hey guys,

I would like to configure analyzers for languages which are not supported by ES out of the box: ET, HR, LT, MT, PL, SK, and SL.

So I took a look at the Hunspell Token Filter. As far as I know, the Hunspell Token Filter is only for stemming, but when I look at the languages that are already configured, they use not only a stemmer but also stop words, lowercasing, and sometimes even keywords.
So how should analyzers based on Hunspell be configured correctly?
And what about this GitHub project https://github.com/elastic/hunspell/tree/master/dicts ?

Thanks.
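
For reference, a minimal sketch of what such a custom analyzer could look like, mirroring the lowercase, stop words, and stemmer chain of the built-in language analyzers. The names `custom_stop`, `custom_hunspell`, and `custom_hunspell_analyzer` are made up for illustration, the three Polish stop words are only placeholders, and the `pl_PL` locale assumes the matching `.aff` and `.dic` files sit under `config/hunspell/pl_PL/` on every node. The script only builds and prints the index settings body; it does not talk to a cluster.

```python
import json


def hunspell_analysis_settings(locale, stopwords):
    """Build index settings for a custom analyzer: lowercase -> stop -> hunspell."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name; the stop word list is supplied by the caller.
                    "custom_stop": {"type": "stop", "stopwords": list(stopwords)},
                    # Hypothetical filter name; the locale must match a dictionary directory.
                    "custom_hunspell": {
                        "type": "hunspell",
                        "locale": locale,
                        "dedup": True,
                    },
                },
                "analyzer": {
                    "custom_hunspell_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first, drop stop words, then stem with Hunspell.
                        "filter": ["lowercase", "custom_stop", "custom_hunspell"],
                    }
                },
            }
        }
    }


# Placeholder stop words, not a complete Polish list.
body = hunspell_analysis_settings("pl_PL", ["i", "w", "na"])
print(json.dumps(body, indent=2, ensure_ascii=False))
```

The printed JSON could be sent as the body of an index creation request. Whether lowercasing before the Hunspell filter suits a given dictionary is something to verify per language, since some dictionaries are case sensitive.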


(Mark Walkom) #2

If you want to look at non-English languages, then also check out the ICU analyser.


(Nicolas Buisson) #3

It seems the ICU plugin is not used for stemming but for normalizing, no?
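
If that is right, the two could be combined rather than treated as alternatives: ICU for tokenizing and normalizing, Hunspell for stemming. A rough sketch, assuming the `analysis-icu` plugin is installed (it provides `icu_tokenizer` and `icu_normalizer`) and that a dictionary for the chosen locale exists under `config/hunspell/`. The names `hunspell_stem` and `icu_hunspell`, and the `sk_SK` locale, are only illustrative.

```python
import json


def icu_hunspell_settings(locale):
    """Build index settings: ICU tokenizing and normalizing, then Hunspell stemming."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name; stemming is done by Hunspell.
                    "hunspell_stem": {"type": "hunspell", "locale": locale},
                },
                "analyzer": {
                    "icu_hunspell": {
                        "type": "custom",
                        # Both ICU components come from the analysis-icu plugin.
                        "tokenizer": "icu_tokenizer",
                        "filter": ["icu_normalizer", "hunspell_stem"],
                    }
                },
            }
        }
    }


settings = icu_hunspell_settings("sk_SK")
print(json.dumps(settings, indent=2))
```

The sketch leaves out `icu_folding` on purpose: it strips diacritics, which would likely keep dictionary lookups from matching.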

