Search performance improvement by adding replica shards

Loris · November 20, 2019, 3:34pm

Hi,
The Elasticsearch documentation says that increasing the number of replica shards improves search and read performance. Could you please explain this in more detail? Looking at the following scenarios, I am not sure how scenario 2 would perform better than scenario 1.

scenario 1:
HOST 1 P0 R1
HOST 2 P1 R2
HOST 3 P2 R0

scenario 2:
HOST 1 P0 R1 R2
HOST 2 P1 R2 R0
HOST 3 P2 R0 R1

Thanks, Loris

spinscale · November 21, 2019, 2:59pm

Hey Loris,

This is not a yes-or-no answer, so let me try to elaborate. In your case I would guess it will not speed things up significantly, since all of your nodes are probably already answering search queries. The difference then comes down to the existing load, plus the fact that in scenario 2 all of the data is available locally on every node, so you could avoid network round trips. If that is possible, you could probably also just use an index with 1 shard and two replicas, so that every node holds all of the data in a single shard, which speeds up queries.
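To make the "1 shard and two replicas" suggestion concrete, here is a minimal sketch that only builds the settings body one might send when creating such an index. `number_of_shards` and `number_of_replicas` are the real index settings; the helper names and the idea of sending the body via `PUT /my-index` (a hypothetical index name) are assumptions for illustration, and no cluster is contacted.

```python
import json


def index_settings(primary_shards: int, replicas: int) -> dict:
    """Build an index-creation body with the given shard layout."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(body: dict) -> int:
    """Primaries plus all of their replicas."""
    index = body["settings"]["index"]
    return index["number_of_shards"] * (1 + index["number_of_replicas"])


# 1 primary + 2 replicas: on a 3-node cluster each node can hold one full copy.
body = index_settings(primary_shards=1, replicas=2)
print(json.dumps(body, indent=2))
print("shard copies:", total_shard_copies(body))
```

With three nodes, three copies means every node can answer a search entirely from local data.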

So, what is meant by adding more replica shards to get faster reads? Imagine having one node with one shard. This node is able to sustain 5,000 queries per second. Now your task is to scale the system to 25,000 queries per second. Adding five more nodes to the cluster and setting the number of replicas to 5 would mean that each node holds a copy of that shard, for a total of 6 shard copies. In this setup your cluster would be able to sustain 30k queries per second (a little headroom might be a good idea).
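The arithmetic behind that example can be sketched as follows. This is a simplified model, assuming a single-primary index, one shard copy per node, and throughput that scales linearly with the number of copies; the function names are made up for illustration.

```python
import math


def cluster_qps(per_node_qps: int, replicas: int) -> int:
    """Sustainable queries/sec when each node holds one copy of the shard."""
    copies = 1 + replicas  # the primary plus its replicas
    return per_node_qps * copies


def replicas_needed(target_qps: int, per_node_qps: int) -> int:
    """Smallest replica count whose copies cover the target load."""
    copies = math.ceil(target_qps / per_node_qps)
    return max(copies - 1, 0)


print(cluster_qps(5000, replicas=0))   # one node, one shard
print(cluster_qps(5000, replicas=5))   # six nodes, six copies
print(replicas_needed(25000, 5000))    # bare minimum, no headroom
```

Under these assumptions 4 replicas would hit 25,000 queries per second exactly, so the fifth replica in the example is the headroom.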

Hope this explanation makes a bit more sense. Using replicas to scale reads means being able to answer more search queries in parallel.

Hope this helps!

--Alex

Loris · November 21, 2019, 3:21pm

Hi Alex
Thank you for your answer, but I'm still confused. Considering your example of 1 index (1 primary, 5 replicas), only 1 shard out of 6 (in total) should be queried. What I assume is that, with Adaptive Replica Selection, in an environment with at least 3 nodes and a lot of indices, adding replicas can help spread the query load across the chosen replicas according to their performance and general load. Is this the explanation?
Thanks!

Loris

spinscale · November 21, 2019, 3:23pm

Indeed, you only need to query one shard. But instead of this shard being available only once in your cluster, it is available 6 times, which lets you fire off more queries. Adaptive replica selection was not part of the equation here; the sheer number of shard copies increases read scalability compared to having only one shard in your cluster.
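A toy simulation may make this clearer. It assumes plain round-robin selection among the copies, which is a deliberate simplification and not how Elasticsearch actually picks a copy; the copy names and the `route_round_robin` function are hypothetical.

```python
from collections import Counter
from itertools import cycle


def route_round_robin(copies: list[str], num_queries: int) -> Counter:
    """Send each query to exactly one copy, rotating through the copies."""
    load = Counter()
    picker = cycle(copies)
    for _ in range(num_queries):
        load[next(picker)] += 1
    return load


# One primary and five replicas of the same shard, one copy per node.
copies = ["P0"] + [f"R0-{i}" for i in range(1, 6)]

single = route_round_robin(["P0"], 30000)
load = route_round_robin(copies, 30000)

print(single["P0"])   # all queries land on the only copy
print(dict(load))     # the same queries spread over six copies
```

Each query still touches only one copy, but the per-copy load drops to one sixth, which is where the extra read capacity comes from.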

system · December 19, 2019, 3:23pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
