Difference between one big and many small indices

Hello, what would be the main difference (performance and resource usage) between 1 x 100 GB index and 100 x 1 GB indices? 4 shards, no replicas.

One important metric is the number of shards a search has to go across. In the first example a search would go across the 4 shards of one index; in the other example a search would need to hit 100 * 4 shards, so 400 shards, which requires more work to merge everything into one sorted result list. Also, a single index would likely end up smaller in total size than many small ones.
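To make the shard counts above concrete, here is a minimal sketch of the arithmetic in Python. The function name `shards_searched` is made up for illustration and is not part of any Elasticsearch API; it assumes no replicas and a search that targets every index.

```python
def shards_searched(num_indices: int, shards_per_index: int) -> int:
    """Primary shards a search across all indices has to query.

    Assumes every index has the same shard count and the search
    targets all of them (hypothetical helper, not an ES API).
    """
    return num_indices * shards_per_index


one_big = shards_searched(num_indices=1, shards_per_index=4)
many_small = shards_searched(num_indices=100, shards_per_index=4)
print(one_big, many_small)  # prints: 4 400
```

Each of those shards returns its own partial result that then has to be merged, so the second layout means roughly 100 times as many per-shard requests for the same search.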

OTOH, having data on many shards allows you to scale out more easily. With the above setup of one index with 4 shards, the maximum cluster size that makes sense would be four nodes, because you could not divide the data any further (think of a shard as the unit of scale).

Also, if you have many nodes, then you might not be able to utilize all of them when querying or indexing if you have fewer shards than nodes.
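The "shard as unit of scale" point can be sketched the same way. This is a simplified model, not how Elasticsearch actually allocates shards: it only assumes that one shard copy lives on exactly one node, so the number of nodes that can hold data is capped by the number of shard copies. The name `max_useful_data_nodes` is hypothetical.

```python
def max_useful_data_nodes(num_nodes: int, primary_shards: int,
                          replicas_per_shard: int = 0) -> int:
    """Upper bound on nodes that can hold at least one shard copy.

    Simplified assumption: a shard copy cannot be split across nodes,
    so extra nodes beyond the number of copies hold no data.
    """
    shard_copies = primary_shards * (1 + replicas_per_shard)
    return min(num_nodes, shard_copies)


# One index with 4 shards, no replicas, on a 10-node cluster:
print(max_useful_data_nodes(num_nodes=10, primary_shards=4))    # prints: 4
# 100 indices with 4 shards each (400 shards) on the same cluster:
print(max_useful_data_nodes(num_nodes=10, primary_shards=400))  # prints: 10
```

So with a single 4-shard index, nodes five through ten would sit without data, while the 400-shard layout can spread over all ten at the cost of the extra per-shard work described earlier.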

As you can see, there is no one true answer, but a general rule of thumb would be to have as few shards as possible while still keeping the possibility to scale.

Hope this helps!

--Alex

I see. So having hundreds of small indices just for the sake of it does not make much sense, and could possibly require more resources for search queries (heap-wise) than just 1 index with 4 shards on 4 nodes, 1 shard each? I'd just like to know if that is the likely outcome, because our RAM resources are very limited :)
