Handling multiple languages

(Rafał Kuć) #1


One of the projects I'm working on requires handling data written in
multiple languages, and the language can be any one, even among the
least common. One thing I was thinking about is to define multi
fields for the most common languages and treat other languages as
English, or use an analyzer without a stemmer for them. Of course,
language identification is simpler during indexing and more complicated
at query time, because queries are short.
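
To make the multi-field idea concrete, here is a minimal sketch in Python that builds such a mapping as a plain dictionary. The field name `body`, the choice of three languages, and both helper functions are hypothetical and only for illustration. `english`, `german`, `french` and `standard` are built-in Elasticsearch analyzers, and the `text` field type assumes a reasonably recent Elasticsearch version (older versions used `string`).

```python
import json

# Hypothetical pick of "most common" languages. Each value is the name of a
# built-in Elasticsearch language analyzer (these include a stemmer).
LANGUAGE_ANALYZERS = {"en": "english", "de": "german", "fr": "french"}


def build_text_mapping(field_name="body"):
    """Build a mapping with one sub-field per language.

    The main field uses the "standard" analyzer, which does no stemming,
    so it serves as the fallback for languages without a sub-field.
    """
    sub_fields = {
        lang: {"type": "text", "analyzer": analyzer}
        for lang, analyzer in LANGUAGE_ANALYZERS.items()
    }
    return {
        "properties": {
            field_name: {
                "type": "text",
                "analyzer": "standard",
                "fields": sub_fields,
            }
        }
    }


def query_fields(detected_lang=None, field_name="body"):
    """Pick the fields to search for a query.

    If language detection succeeded and the language has a sub-field,
    search that sub-field plus the unstemmed fallback. Otherwise search
    every field, since short queries are hard to identify reliably.
    """
    if detected_lang in LANGUAGE_ANALYZERS:
        return [f"{field_name}.{detected_lang}", field_name]
    return [field_name] + [f"{field_name}.{lang}" for lang in LANGUAGE_ANALYZERS]


if __name__ == "__main__":
    print(json.dumps(build_text_mapping(), indent=2))
```

The list returned by `query_fields` could be passed as the `fields` of a `multi_match` query. With this layout every document is analyzed by all configured analyzers at index time, which costs index size but avoids depending on language detection for short queries.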

The alternative I was wondering about is not using multi fields, but
multiple mappings in a single index, one mapping per language.

Is one of those approaches better than the other, or maybe you have some
better ideas for how multiple languages can be handled in Elasticsearch?

Thanks in advance,
