Best way to handle multiple fields with same text

dbakti7 · May 18, 2020, 8:32am

Hi, all!

We are currently trying to support search in multiple languages, but we do not index each document into all of these language fields (hence we are not using multi-fields).
We perform language detection to figure out which language fields should be used (can be more than one).
The problem is that when we specify the text in multiple fields, the HTTP payload size grows very quickly (a complete duplicate of the text per language field).

What would be a better way to handle this situation (especially to avoid the explosion in HTTP request size)?
Is it possible to achieve this with a Painless script?

Thank you!

dadoonet · May 18, 2020, 11:04am

The guide (book), although old and based on version 1.x, still seems accurate to me. Have a look at https://www.elastic.co/guide/en/elasticsearch/guide/current/language-pitfalls.html

Basically, if you don't want to have the same content in multiple languages within the same document, you can index one document per language, as described here: https://www.elastic.co/guide/en/elasticsearch/guide/current/one-lang-docs.html
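
For illustration, here is a rough sketch of what the one-document-per-language approach could look like on the client side. It only builds the newline-delimited body for the `_bulk` API and does not send anything. The index naming scheme (`docs-en`, `docs-de`, ...), the field names `text` and `lang`, and the helper `build_bulk_payload` are assumptions made up for this example, not something taken from the guide or from the original poster's setup.

```python
import json


def build_bulk_payload(doc_id, text, languages, index_prefix="docs"):
    """Build an NDJSON body for the _bulk API, one document per language.

    Assumption: each language has its own index named
    "<index_prefix>-<lang>" whose mapping applies the matching analyzer
    to the "text" field. These names are hypothetical.
    """
    lines = []
    for lang in languages:
        # Action line: which index and id this document is written to.
        action = {"index": {"_index": f"{index_prefix}-{lang}", "_id": doc_id}}
        # Source line: a single text field instead of one field per language.
        source = {"text": text, "lang": lang}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The _bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "en" and "de" stand in for whatever language detection returned.
    print(build_bulk_payload("1", "some text", ["en", "de"]), end="")
```

Note that with this layout the text is still sent once per detected language, so it mainly helps when detection usually returns a single language; it removes the sibling language fields from each document rather than the duplication itself.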

system · June 15, 2020, 11:05am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
