If you have capacity, is 1 index with 5 shards better than 5 indices with 1 shard each?

bruce289 · April 2, 2024, 8:21pm

Let’s say the cluster is empty and we have the choice between:

1 index with 5 50 GB shards
5 indices, each with 1 50 GB shard

Is there any difference between these 2 options purely from a performance standpoint? Assuming that we will index into and search across all shards concurrently in both cases.

I think I'll benchmark this anyway but I'd just like to know from a theoretical perspective, what would be the cause of any potential performance differences here?

dadoonet · April 2, 2024, 8:36pm

Not really a difference. I'm not sure you will notice it.
But what kind of use case is it?

elasticforme · April 2, 2024, 8:50pm

if you have five machine five shard will be on seperate systems
same way your index will be in five different systems.

but when you read a data all first loaded data will be in one machine and all last loaded data in last machine (last index)

this means you have hotspot.

but if you have five shard it is random and hence you hitting different systems. ( this is logic if data is being writting on different shard at a time) just like raid5 on storage.

dadoonet · April 2, 2024, 9:05pm

I don't think so.

If you query 5 index with 1 shard or one index with 5 shards, that should behave the same way IMHO.

bruce289 · April 3, 2024, 5:50pm

Thanks for the responses. Yeah, my assumption was as you say, David, that there shouldn't be any performance impact but I'll test it out.

So I largely just wanted to understand the impact of moving from daily indices to ILM and how many shards I should be assigning to indices when we do so.

We currently have some daily indices which we configure with 20-50 pri, 1 rep, which results in approximately 50gb shards once the day is done. From previous experience, we’re pretty sure that by no longer indexing into all of these shards at a time, we should decrease the stress that indexing puts on the cluster. And if we go with 1 shard for the write index and roll over at 50gb, then we’ll be in a similar place to my original question:

Before: 1 index, 50 shards
After: 50 indices, 1 shard each

I realise that I originally asked about indexing into all shards/indices concurrently, which won’t actually be the case, but I was just curious about that scenario too. We will, however, be searching concurrently.

We’ll also need to check whether indexing into 1 shard gives us sufficient throughput. Although, even if it does, I’m guessing we could end up in a situation where some of the busier indices are all located on the same node, which could be troublesome.

Christian_Dahlqvist · April 3, 2024, 6:01pm

Switching to rollover with a smaller number of primary shards should be positive for indexing speed as bulk requests would be less spread out. As the older indices will no longer be written to, they can be optimised (e.g. forcemerge) earlier, which may also help with resource usage and search speed.

If you have index patterns with varying size you may want to set larger ones to have multiple primary shards in order to better spread out the write load. If you have indices with multiple primary shards you can force these to be spread out across different nodes using the index.routing.allocation.total_shards_per_node setting. Be careful with this as it may prevent shards from being allocated on node failures if you are too aggressive.

bruce289 · April 3, 2024, 6:19pm

Thanks @Christian_Dahlqvist . Yeah, we use that setting currently with our daily indices. So we have about 5-10 index patterns with tens of shards, then about 100 or so with 1 or 2 shards. I'm wondering if it's worth doing some allocation include/exclude settings on the larger index patterns to prevent a bunch of the write indices all being allocated on the same node.

system · May 1, 2024, 6:20pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
When do you need more then 1 shard? Elasticsearch	12	1853	July 6, 2017
Performance searching single index vs multiple indices Elasticsearch	9	18132	July 27, 2018
Better to have more indices or more shards per index? Elasticsearch	3	256	August 15, 2022
Sharding and Performance Elasticsearch	1	310	August 29, 2018
Filtered Aliases VS Separate indexes Elasticsearch	12	1416	July 15, 2017

If you have capacity, is 1 index with 5 shards better than 5 indices with 1 shard each?

Related topics