How to create many small separate indices

Hi!

I want to create many (~10k) indices, each holding a couple of hundred documents. I have read elsewhere that this is discouraged because every index needs at least one shard, which adds a lot of overhead. However, I have the requirement that the documents are indexed separately, i.e. I do not want the text statistics of a document belonging to index A to influence those of index B. Any ideas on how to solve this?

I was thinking of having a few indices, but with many mappings, where each mapping would hold only the documents that should be indexed together. Would that work?

You can have one index with multiple groups of fields to organize the mapping.
Example:

group        field
doc_type_1   doc_type1_field1
             doc_type1_field2
             doc_type1_field3
             doc_type1_field4
             doc_type1_field5
             doc_type1_field6
             doc_type1_field7
doc_type_2   doc_type2_field1
             doc_type2_field2
             doc_type2_field3
             doc_type2_field4
             doc_type2_field5
             doc_type2_field6
             doc_type2_field7
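
The grouping above could be sketched as a single index body in which each group gets its own prefixed text fields. As far as I know, Lucene keeps term statistics per field, so a text field that only one group writes to should not be influenced by the other groups. The `group` keyword field and the helper name `build_grouped_mapping` are assumptions for illustration, not something from this thread. Also note that the default limit of 1000 fields per index (`index.mapping.total_fields.limit`) would be reached long before 10k groups, so this is only a sketch for a modest number of groups.

```python
import json


def build_grouped_mapping(groups, fields_per_group=7):
    """Build an index body where every group has its own prefixed text fields."""
    properties = {
        # Hypothetical keyword field recording which group a document belongs to.
        "group": {"type": "keyword"},
    }
    for group in groups:
        for i in range(1, fields_per_group + 1):
            # e.g. doc_type1_field1 ... doc_type1_field7
            properties[f"{group}_field{i}"] = {"type": "text"}
    return {"mappings": {"properties": properties}}


body = build_grouped_mapping(["doc_type1", "doc_type2"])
print(json.dumps(body, indent=2))
```

The resulting dictionary is the kind of body you would send when creating the index; a search for one group would then query only that group's fields.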

Is it also possible to solve this with aliases, i.e. have many aliases that point to the same index? Would that also respect the "separate indexing"? That is at least my understanding of what was suggested in the forum here

Many aliases are not the same as many indices. Aliases have far less overhead than 10k indices.

By default, one node is limited to 1000 open shards. You can raise that to 2000 if you want, but then the node may behave differently, and I can't tell you how.
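
To put the shard limit in perspective, here is a rough back-of-the-envelope sketch. It assumes one primary shard and one replica per index (the defaults in recent Elasticsearch versions, as far as I know) and the limit of 1000 shards per node mentioned above; the function name `nodes_needed` is made up for this example.

```python
import math


def nodes_needed(num_indices, primaries_per_index=1, replicas=1,
                 max_shards_per_node=1000):
    """Return (total shards, minimum node count) under a per-node shard limit."""
    # Replicas count as shards too, so each primary contributes (1 + replicas).
    total_shards = num_indices * primaries_per_index * (1 + replicas)
    return total_shards, math.ceil(total_shards / max_shards_per_node)


total, nodes = nodes_needed(10_000)
print(f"{total} shards need at least {nodes} nodes")
```

Under these assumptions, 10k tiny indices would need a sizeable cluster just to stay within the limit, which is why grouping is usually preferred.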

One index has exactly one mapping; it can't have multiple mappings.
The mapping basically tells you what field types you have in that index: text, keyword, integer, boolean, etc.

If you post some example records, it will give us a better idea of how to organize them. Nowadays there are many different ways to do the same thing.

I have actually just tried creating a single index with many aliases and a filter per alias. However, that does not seem to respect analyzing the documents "per alias" separately. At least when I do a search with /_explain, it tells me some terms were found in 1000s of documents in the index (but for that alias, there are only ~100 documents).
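
For reference, the setup described here (one index, one filtered alias per group) might look roughly like the request body below for the `_aliases` API. The index name `docs`, the `group` keyword field, and the alias naming scheme are assumptions for illustration; the thread does not show the actual request.

```python
import json


def filtered_alias_actions(index, groups, group_field="group"):
    """Build an _aliases request body with one filtered alias per group."""
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    # Hypothetical naming scheme: <index>-<group>
                    "alias": f"{index}-{group}",
                    # The filter restricts which documents the alias returns;
                    # it does not change how scores are computed.
                    "filter": {"term": {group_field: group}},
                }
            }
            for group in groups
        ]
    }


actions = filtered_alias_actions("docs", ["a", "b"])
print(json.dumps(actions, indent=2))
```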

Aliases only offer filtering and, as you correctly pointed out, do not affect relevancy. Having lots of aliases could in itself be an issue anyway. I suspect grouping the data as Yassine suggested might be a good compromise.
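
To illustrate why this matters for relevancy, here is a small sketch of the inverse document frequency term as I understand it from Lucene's BM25 similarity, `ln(1 + (N - n + 0.5) / (n + 0.5))`, where `n` is the number of documents containing the term and `N` the number of documents with the field. The document counts are made-up numbers chosen to mirror the situation above, not measurements.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """IDF component of BM25: rarer terms get a higher weight."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical numbers: the term occurs in 3000 of 100000 documents in the
# shared index, but in 40 of the 100 documents behind one alias.
index_wide = bm25_idf(doc_freq=3000, doc_count=100_000)
group_only = bm25_idf(doc_freq=40, doc_count=100)

print(f"idf with index-wide statistics: {index_wide:.3f}")
print(f"idf with group-only statistics: {group_only:.3f}")
```

With a filtered alias, scoring uses the index-wide figures (strictly speaking, the per-shard ones), so a term that is common within one group can still look rare and be weighted more heavily than it would be if the group were indexed on its own.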
