Search content in English and resultset returns in other languages

Antonio_Teixeira · December 18, 2020, 8:15am

Hello All

I'm trying to find the best way to get, in a single result set, the matches for a search term in one language plus the matches for the same term in other languages.
For example, if I search for "black", I want Elasticsearch to return all the matches for "black" and also all the matches for "black" in other languages that may be present in the data, such as German, Italian, or Spanish.
My question is: what is the best option to do this?

vincenbr · December 18, 2020, 10:23am

Hi Antonio,
You are touching on a vast subject, which is cross-language search (or Cross-Language Information Retrieval in academic terms). I think the hardest part is managing translation, which is beyond Elasticsearch's scope.
At first glance, you could take one of two approaches:

1. You have n indices. Each index has its own language and associated mapping/analyzers. You translate your input query into n languages and launch n queries. You get n result lists and present a tabbed result page (one tab per language).
2. You still have n indices (or one combined index, which makes no functional difference). At index time, in a pipeline for instance, you extract the significant text of your non-English docs and send it to a translation service. You put the result in an "english_text" field that you've added to your mapping. At search time, you run only one query and can display a combined result list, which your users can sort or filter as they please.
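
To make the two approaches more concrete, here is a rough Python sketch that only builds the request bodies and does not talk to a cluster. Everything named here is an assumption for illustration: the `docs_<lang>` index naming, the `content` field, the toy `translate` lookup (a stand-in for whatever real translation service you pick), and the helper function names. Only the "english_text" field and the general shape of a `match` query come from the discussion above.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a real translation service: a tiny lookup table.
TOY_DICTIONARY: Dict[str, Dict[str, str]] = {
    "black": {"de": "schwarz", "it": "nero", "es": "negro"},
}


def translate(text: str, target_lang: str) -> str:
    """Toy translator; returns the input unchanged when no entry exists."""
    return TOY_DICTIONARY.get(text.lower(), {}).get(target_lang, text)


def build_per_language_searches(
    query: str,
    languages: List[str],
    field: str = "content",        # assumed field name
    index_prefix: str = "docs_",   # assumed index naming scheme
) -> List[dict]:
    """Approach 1: one search per language index, each with a translated query."""
    searches = []
    for lang in languages:
        # The user types in English, so the English index gets the raw query.
        term = query if lang == "en" else translate(query, lang)
        searches.append(
            {
                "index": f"{index_prefix}{lang}",
                "body": {"query": {"match": {field: term}}},
            }
        )
    return searches


def add_english_text(
    doc: dict, source_lang: str, to_english: Callable[[str], str]
) -> dict:
    """Approach 2, index time: copy the doc and add a translated english_text field."""
    enriched = dict(doc)
    text = doc["content"]
    enriched["english_text"] = text if source_lang == "en" else to_english(text)
    return enriched


def build_combined_search(query: str) -> dict:
    """Approach 2, search time: a single query against english_text."""
    return {"query": {"match": {"english_text": query}}}
```

With the first approach you would send each `body` to its own index and show one result list per language. With the second, you would enrich documents before indexing them and then run only the body from `build_combined_search`. In a real setup the translation step would more likely live in an ingest pipeline or an external indexing job than in client code like this.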

Of course there are other approaches (semantic, vector-based search, for example), and the right choice greatly depends on your constraints, data volumes, and requirements.

Antonio_Teixeira · December 21, 2020, 10:26am

Thank you for your answer.
As for the volume, I don't think I will have a huge amount of data, so I'm thinking about the best way to proceed.

system · January 18, 2021, 10:26am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
