Stemming on a multi-language index


(venkata nagaraju buddarapu) #1

Hi,

My Elasticsearch index contains documents in more than one language, and a single document can itself contain more than one language.

For example, my index has Chinese, Japanese, and Korean documents.

The Chinese documents also contain English text.

My understanding from the stemming documentation is that each language has its own recommended stemmer.

How do I configure my analyzer for indexing and searching with multiple stemmer filters? Won't they interfere with each other?

Regards,
Nagaraju


(Mark Walkom) #2

You can configure stemming (analysis) per field, but that may get messy.
It might be worth splitting the docs into separate language-specific indices and then applying the appropriate stemmer to all docs in each one.
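
As a rough illustration of the two options, here is a minimal Python sketch that only builds the index bodies as plain dicts (it does not talk to a cluster). The field name `body`, the `docs-<lang>` index names, and the language-to-analyzer table are all hypothetical. `english` and `cjk` are built-in analyzers; language-specific ones such as `smartcn`, `kuromoji`, or `nori` come from plugins and could be swapped in if they are installed. The typeless `mappings.properties` layout assumes Elasticsearch 7 or later.

```python
import json

# Hypothetical language -> analyzer table. "english" and "cjk" ship with
# Elasticsearch; plugin analyzers (smartcn, kuromoji, nori) are an assumption
# about what is installed and could replace "cjk" here.
LANGUAGE_ANALYZERS = {
    "en": "english",
    "zh": "cjk",
    "ja": "cjk",
    "ko": "cjk",
}


def per_field_mapping(field="body"):
    """Option 1: one index, with one sub-field (multi-field) per analyzer."""
    analyzers = sorted(set(LANGUAGE_ANALYZERS.values()))
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    # Each sub-field runs its own analysis chain on the same text.
                    "fields": {
                        name: {"type": "text", "analyzer": name}
                        for name in analyzers
                    },
                }
            }
        }
    }


def per_language_indices(prefix="docs", field="body"):
    """Option 2: one index per language, each with a single analyzer."""
    return {
        f"{prefix}-{lang}": {
            "mappings": {
                "properties": {field: {"type": "text", "analyzer": analyzer}}
            }
        }
        for lang, analyzer in LANGUAGE_ANALYZERS.items()
    }


if __name__ == "__main__":
    print(json.dumps(per_field_mapping(), indent=2))
    print(json.dumps(per_language_indices(), indent=2))
```

With the first layout, each sub-field (`body.english`, `body.cjk`) is analyzed independently, so the stemmers do not interfere with each other; the messy part is that queries have to target the right sub-fields. With the second layout, each index needs only one analyzer, but every document has to be routed to the correct index at ingest time.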

