Improve search performance beyond 2x

pathaniaamn · February 15, 2023, 4:44am

Question on replica shard:

Node1 P0
Node2 P1
Node3 R0
Node4 R1

So, overall search performance can get 2x as Primary and Replica shards will share the search load. But what is next if search load increases to 5 fold or even 10. How is that handled?

I don't think you would like to have a 4 or 6 copies of replica shards, that does not look smart.

Christian_Dahlqvist · February 15, 2023, 6:44am

Search performance is generally a combination of the search latency achieved and the number of concurrent queries the cluster can support. The latency will depend on e.g. the size of the shards, the type of data and queries used as well as resource constraints. As the number of concurrent queries will increase resource usage, these are linked.

If you have a node with a primary shard (or two nodes each with a primary shard as in your scenario) you can run a benchmark with a single concurrent query and measure the latency the node(s) can support for different types of queries. Once you have established a latency threshold you are willing to accept, you can gradually increase the number of concurrent queries until you see that the latency the cluster can support increases beyond the target. Lets assume you are able to serve X concurrent queries within the expected latency. If you continue to increase the number of concurrent queries, work will be queued up and latency increase further, but the cluster can still continue to serve queries. At some point you will start to see queries getting rejected as they can no longer be queued up.

In order to serve more concurrent queries you can then either scale up the node by adding resources, which alters the capacity of the node and changes the calculation above, or scale out by adding more nodes and start using replica shards. If you add one replica shard on another node (or 2 nodes in your scenario) the latency would not be affected but the cluster should be able to serve close to 2X concurrent queries.

If you want to serve more concurrent queries at the target latency you can continue to scale out or improve the query throughput a single node can handle, e.g. by ensuring the nodes page cache is large enough to hold the full data set, thereby eliminating/reducing disk I/O as a resource constraint.

pathaniaamn · February 19, 2023, 1:26am

Thanks for the input. I appreciate that.

So, are you suggesting 3-4 replica nodes for the same primary node after the latency is not expected?

To think on broader scale, would search platforms like google and bing would follow such strategy or there are other smarter ways to handle this than redundant data copies (replica nodes) or vertical scaling which i feel is usually not the answer when you know that search hits would increase for sure.

Illustration:

Node1 P0
Node2 P1
Node3 R0-01
Node7 R0-02
Node8 R0-03
Node4 R1-01
Node4 R1-02
Node4 R1-03

Christian_Dahlqvist · February 19, 2023, 10:15am

Commercial search engines like Google and Bing are large, complex and highly optimised systems, and I do not think you can compare that to how you scale Elasticsearch.

There is a lot that go into optimizing search performance in Elasticsearch and the example you covered only covers one aspect of this. I would recommend having a look at this section in the docs. To get the best performance from Elasticsearch I would recommend looking into the following:

Optimise how data is stored. This includes document size and structure as well as mappings used and shard size.
Optmise how you query data. This includes writing as efficient queries as possible as well as targeting data efficiently. This may require restructuring the data to better support certain types of queries.
Ensure that cluster resources are used as efficiently as possible. Identify and address bottlenecks. Does it make more sense to scale up or out?

system · March 19, 2023, 10:15am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Search performance improovment by adding replicas shards Elasticsearch	4	1451	December 19, 2019
Scaling ES for search Elasticsearch	4	392	June 18, 2019
Does adding multiple shard replicas increase performance? Elasticsearch	5	2297	March 28, 2017
Getting worse search performance with a replica shard Elasticsearch	9	2195	July 5, 2017
Distributed search, how to work? Elasticsearch	6	1771	April 26, 2019

Improve search performance beyond 2x

Related topics