Hello,
I'm developing a project where I'm using Elasticsearch to index files (pdf, doc, txt...). I have to index the content of these files, and they're written in different languages.
My concern is the stopwords filter. I've used it before, but only with a single-language index, so I didn't have any problems. Now I'm going to have from 7 to 12 languages (English, French, Spanish, Galician, German...). I did some research and didn't find any relevant information.
My questions are:
- Is there any kind of stopwords filter for multi-language purposes?
- Should I use several stopwords filters? (That doesn't seem optimal to me.)
- Should I have one index per language, so that each index has a different mapping?
- Or should I merge the stopword lists of all the languages I need to index into a single filter?
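To make the "several filters" and "one index per language" options concrete, here is a rough sketch of the settings I have in mind. It only builds the JSON body and does not talk to a cluster. The analyzer and filter names (english_text, english_stop, ...) are names I made up for illustration; the "_english_", "_galician_" style values are the predefined stop word lists that the stop token filter accepts. I'm not sure this is the right approach, it's just what I would try first.

```python
import json

# Hypothetical language list for illustration; each name is turned into one of
# Elasticsearch's predefined stop word lists, e.g. "_english_" or "_galician_".
LANGUAGES = ["english", "french", "spanish", "galician", "german"]


def build_settings(languages):
    """Build an index settings body with one stop filter and one analyzer per language."""
    filters = {}
    analyzers = {}
    for lang in languages:
        filter_name = f"{lang}_stop"  # made-up naming scheme
        filters[filter_name] = {"type": "stop", "stopwords": f"_{lang}_"}
        analyzers[f"{lang}_text"] = {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", filter_name],
        }
    return {"settings": {"analysis": {"filter": filters, "analyzer": analyzers}}}


if __name__ == "__main__":
    # Print the body that would be sent when creating an index.
    print(json.dumps(build_settings(LANGUAGES), indent=2))
```

With one index per language I would call build_settings with a single language for each index. With a single shared index I would pass the whole list and point one field per language at its own analyzer.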
I hope someone has faced this issue before and can point me to a successful solution.
Thanks in advance