If I were to use only one universal engine, I would have to duplicate each document for each translation it has, is that correct?
That is one way of doing it; another would be to create a different field in each document for each translation (that is, content_en, content_es, content_de, etc.) and avoid duplication. That way, duplicate results are not possible when searching for content that is the same across languages (like location names, part numbers, or any other language-agnostic content).
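As a rough sketch of that second approach, a single document could carry one field per translation. This assumes the App Search documents endpoint; the base URL, API key, engine name, and field values are placeholders:

```python
import requests

BASE_URL = "https://localhost:3002/api/as/v1"            # placeholder endpoint
HEADERS = {"Authorization": "Bearer private-xxxxxxxx"}   # placeholder private key

# One document per logical item: translations live in separate fields,
# and language-agnostic content (part numbers, locations) is stored once.
doc = {
    "id": "part-1234",
    "part_number": "PN-1234",  # language-agnostic, never duplicated
    "content_en": "Stainless steel bolt",
    "content_es": "Tornillo de acero inoxidable",
    "content_de": "Schraube aus Edelstahl",
}

resp = requests.post(
    f"{BASE_URL}/engines/docs-universal/documents",
    headers=HEADERS,
    json=[doc],  # the documents API accepts an array
)
print(resp.json())
```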
We tend to recommend the duplication with language-specific engines approach for multi-language documents, as it is the more flexible and tuneable of the two.
Maybe creating all engines beforehand and configuring tuning only on the meta engine would alleviate this pain?
- Do we need to put all the unsupported-language documents and/or translations in a universal engine that would also be part of the meta engine?
Definitely! You could create the language-optimised engines beforehand, and have another "catch-all" engine using the universal language where docs in other languages are stored.
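A minimal sketch of that setup, assuming the App Search engines endpoint and meta engine creation via type and source_engines; all engine names, the base URL, and the key are placeholders:

```python
import requests

BASE_URL = "https://localhost:3002/api/as/v1"            # placeholder endpoint
HEADERS = {"Authorization": "Bearer private-xxxxxxxx"}   # placeholder private key

# Language-optimised engines, created up front.
for lang in ["en", "es", "de"]:
    requests.post(f"{BASE_URL}/engines", headers=HEADERS,
                  json={"name": f"docs-{lang}", "language": lang})

# Catch-all engine on the universal language (language left as null).
requests.post(f"{BASE_URL}/engines", headers=HEADERS,
              json={"name": "docs-universal", "language": None})

# Meta engine grouping all of them behind a single search endpoint.
requests.post(f"{BASE_URL}/engines", headers=HEADERS,
              json={"name": "docs-meta", "type": "meta",
                    "source_engines": ["docs-en", "docs-es",
                                       "docs-de", "docs-universal"]})
```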
Keep in mind that Meta Engines cannot be individually tuned; result settings, curations and relevance tuning apply to all its source engines.
- Possibly less performant? (Compared to just querying one engine?)
Elasticsearch can deal with that; it should not involve a heavy performance penalty.
In the end, the best solution will depend on your content and use cases. What I would suggest is to do some proofs of concept with the following approaches:
- A single, universal engine with duplicate documents (one doc for each language; see the query sketch after this list)
- A single, universal engine without duplicate documents (each doc carries its translations in separate fields)
- Multiple, language-specific engines with documents specific to each language (plus another, default engine for documents in non-optimised languages).
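For the first approach, a hypothetical language field on each duplicated document lets you filter at query time, so the same logical document is not returned once per translation. A sketch assuming the App Search search endpoint, where each duplicate carries a single content field plus its language (all names are placeholders):

```python
import requests

BASE_URL = "https://localhost:3002/api/as/v1"            # placeholder endpoint
HEADERS = {"Authorization": "Bearer search-xxxxxxxx"}    # placeholder search key

# Filter on a hypothetical "language" field so each logical document
# comes back at most once, in the requested language.
resp = requests.post(
    f"{BASE_URL}/engines/docs-universal/search",
    headers=HEADERS,
    json={"query": "stainless steel bolt", "filters": {"language": "en"}},
)
for result in resp.json().get("results", []):
    # App Search wraps field values in {"raw": ...} objects.
    print(result["id"]["raw"], result["content"]["raw"])
```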