Hi ES Team,
We are building a real-time search system for one of our applications. Our customer base is huge and so is the data: we currently have around 13 TB, and it is expected to grow. Customers can search in any language. As of now, we offer search in 8 languages, and that number is not expected to grow often.
Given this, we have an index with many fields. Two of those fields are language dependent and have to be searchable in all supported languages. Currently all documents are already indexed in English. To enable multiple languages on the two fields, we see two options:
- Use one field per language. The problem here is that re-indexing the existing data (in the huge index) could be cumbersome, and more so as the index grows.
- Have two indexes. One index would be used purely for translation: searching it with a foreign word returns the corresponding English word. With this English word, we then search the existing index to get the customer data in English.
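To make the two options concrete, here is a rough Python sketch of the request bodies we have in mind. It only builds plain dictionaries and does not call a cluster. The field names (`title`, `foreign_term`, `english_term`) and the short language list are placeholders we made up for illustration, not our real schema.

```python
# Placeholder mapping from language code to a built-in Elasticsearch
# language analyzer. Our real list has 8 languages; these are examples.
ANALYZERS = {"en": "english", "fr": "french", "de": "german"}


def option1_properties(base_fields, analyzers):
    """Option 1: one text field per (base field, language) pair,
    e.g. title_en, title_fr, each with its own language analyzer."""
    return {
        f"{base}_{lang}": {"type": "text", "analyzer": analyzer}
        for base in base_fields
        for lang, analyzer in analyzers.items()
    }


def option1_query(text, base_fields, analyzers):
    """Option 1 search: one multi_match across every per-language field."""
    fields = [f"{base}_{lang}" for base in base_fields for lang in analyzers]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


def option2_lookup_query(foreign_word):
    """Option 2, step 1: look up the foreign word in the small
    translation index (hypothetical field name: foreign_term)."""
    return {"size": 1, "query": {"match": {"foreign_term": foreign_word}}}


def option2_main_query(english_word, base_fields):
    """Option 2, step 2: search the existing English-only index with the
    English word returned by step 1 (hypothetical field: english_term)."""
    return {
        "query": {
            "multi_match": {"query": english_word, "fields": list(base_fields)}
        }
    }
```

With option 1 the existing index needs the new fields populated, so every document is touched once per added language. With option 2 the large index stays as it is, but every foreign-language search costs two round trips and depends on the quality of the translation index.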
Which approach would be better and simpler, given that the index is expected to keep growing? Any help would be appreciated!