Many small indices vs One large index

I know this question has been asked many times, but I need more clarity on the topic. Say I have data for 5,000 customers. Every document has the same set of fields, but the customers are independent of each other, i.e. I will never need to query multiple customers together. Each customer would have 2 indices, which makes 10,000 indices in total. I can either combine all of them into 2 big indices with some tens of shards, or keep them all as separate indices.
I would also like to know how many nodes are recommended in each case. Given that there should be no more than a thousand shards per node, the many-indices configuration would need at least 10 nodes, whereas the large-index configuration could start with 1 node and add more if necessary.
I would appreciate any suggestions on the best configuration. An early response would be welcome.
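
The node estimate above can be checked with a little arithmetic. This is a minimal sketch, assuming one primary shard per index and no replicas in the many-indices case, and 10 primary shards per index in the consolidated case. The function name and its defaults are illustrative only and are not part of any Elasticsearch API.

```python
import math


def min_nodes_by_shard_count(indices, primaries_per_index, replicas=0,
                             max_shards_per_node=1000):
    """Lower bound on node count implied purely by a shards-per-node cap.

    This ignores data volume, heap, and availability requirements; it only
    answers "how many nodes before the shard cap is exceeded".
    """
    total_shards = indices * primaries_per_index * (1 + replicas)
    return math.ceil(total_shards / max_shards_per_node)


# 5,000 customers x 2 indices, one primary shard each, no replicas (assumed).
many_small = min_nodes_by_shard_count(10_000, 1)

# 2 shared indices with 10 primary shards each (assumed), no replicas.
two_large = min_nodes_by_shard_count(2, 10)

# Adding one replica per shard doubles the shard count in the first case.
many_small_with_replica = min_nodes_by_shard_count(10_000, 1, replicas=1)

print(many_small, two_large, many_small_with_replica)
```

Under these assumptions the many-indices layout needs at least 10 nodes just to stay under the cap (20 with one replica), while the consolidated layout is nowhere near it.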

Having lots of small indices and shards is very inefficient, so it is almost always better to consolidate.

Thanks for the early response. Could you please help me with the second part of the question as well?

To have a highly available cluster you need at least 3 nodes. As you do not query across users, I would recommend using routing when indexing and querying. How many shards you need depends on the data volume and workload.
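
As a rough illustration of what routing looks like at the request level, the sketch below builds the REST path and body for an index call and a search call, both carrying the `routing` query parameter. The index name `orders`, the `customer_id` field (assumed to be mapped as a keyword), and the helper names are hypothetical. The search body still filters on the customer ID, on the assumption that several customers end up sharing a shard.

```python
from urllib.parse import quote


def index_request(index, customer_id, doc_id, doc):
    """Return (method, path, body) for indexing one document with routing."""
    routing = quote(str(customer_id), safe="")
    path = f"/{index}/_doc/{quote(str(doc_id), safe='')}?routing={routing}"
    # Store the customer ID in the document so searches can filter on it.
    body = dict(doc, customer_id=customer_id)
    return "PUT", path, body


def search_request(index, customer_id, query):
    """Return (method, path, body) for a search limited to one customer.

    The routing parameter narrows the search to the shard holding this
    customer; the term filter excludes other customers on the same shard.
    """
    routing = quote(str(customer_id), safe="")
    path = f"/{index}/_search?routing={routing}"
    body = {
        "query": {
            "bool": {
                "must": [query],
                "filter": [{"term": {"customer_id": customer_id}}],
            }
        }
    }
    return "POST", path, body


print(index_request("orders", "cust-42", "1", {"total": 9.5}))
print(search_request("orders", "cust-42", {"match_all": {}}))
```

The returned tuples can be passed to whatever HTTP client you already use. Nothing here contacts a cluster.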

Thanks, routing seems the way to go instead of filtering based on user ID. My query load will be a few hundred qps at peak, and for that I think 10 shards should be enough.
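
To see why routing limits the work done per query, here is a rough sketch of how a routing value maps to a shard. It assumes the simplified rule "shard = hash(routing value) mod number of primary shards". `zlib.crc32` is only a stand-in for Elasticsearch's own hash function, so the shard numbers it produces will not match a real cluster, and the `cust-N` IDs are made up.

```python
import zlib

NUM_PRIMARY_SHARDS = 10  # the starting point discussed in this thread


def shard_for(routing_value, num_primary_shards=NUM_PRIMARY_SHARDS):
    """Map a routing value to a shard number (illustrative hash only)."""
    digest = zlib.crc32(routing_value.encode("utf-8"))
    return digest % num_primary_shards


# Hypothetical customer IDs used as routing values.
customers = [f"cust-{n}" for n in range(5000)]
placement = {customer: shard_for(customer) for customer in customers}

# The same routing value always resolves to the same shard.
print(shard_for("cust-42") == placement["cust-42"])
```

Because a customer's documents all land on one shard, a routed search touches a single shard rather than all 10. With 5,000 customers spread over 10 shards, each shard holds many customers, so a routed query still has to restrict results to the customer in question.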

Sounds like a reasonable starting point.
