We started seeing high latency in the applications querying Elasticsearch (7.17.0) and found that search_fetch_time increases significantly whenever there is a significant increase in incoming search traffic.
When analysing the search_fetch_time metrics further, we saw that a few specific indices have the highest fetch time, and they are configured with 1 primary and 1 replica.
Can anyone suggest how to improve search_fetch_time?
Would increasing replicas improve the situation as it seems to help with parallel searching?
Are there any tradeoffs, other than the obvious disk space requirement, when increasing shards/replicas for any index?
There could be quite a few things to consider in this situation, but here are some general thoughts.
search_fetch_time mostly has to do with fetching the docs from the shards/segments after they've been identified, so things to consider would be disk speed, result set size, document size, and document enrichment. All of these (plus some other compounding factors) could be looked at to help make the fetch faster. I would generally start with disk performance and contention.
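To narrow down which indices are slow to fetch, one option is to compare the average time per fetch phase across indices. The sketch below assumes you have already retrieved the JSON from `GET /_stats/search` and uses its `fetch_total` and `fetch_time_in_millis` counters; the function name and the sample numbers are made up for illustration.

```python
from typing import List, Tuple


def average_fetch_ms(stats: dict) -> List[Tuple[str, float]]:
    """Return (index, average ms per fetch phase), slowest index first.

    `stats` is expected to look like the body of GET /_stats/search.
    """
    results = []
    for name, data in stats.get("indices", {}).items():
        search = data.get("total", {}).get("search", {})
        fetch_total = search.get("fetch_total", 0)
        fetch_ms = search.get("fetch_time_in_millis", 0)
        if fetch_total > 0:  # skip indices that have served no fetches
            results.append((name, fetch_ms / fetch_total))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical sample data, not taken from a real cluster.
sample = {
    "indices": {
        "logs": {"total": {"search": {"fetch_total": 1000, "fetch_time_in_millis": 5000}}},
        "orders": {"total": {"search": {"fetch_total": 200, "fetch_time_in_millis": 9000}}},
    }
}

print(average_fetch_ms(sample))
```

These counters are cumulative, so to see what happens during a traffic spike you would likely want to take two snapshots and run the same calculation on the difference between them.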
Increasing replicas may help improve your fetch speeds if there is contention on the nodes that hold the current primary and replica. It would put more copies of the data on more nodes, giving the search more places to pull from when returning results to the coordinator.
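If you want to try this, the replica count is a dynamic index setting and can be changed through the update index settings API. Here is a minimal sketch that only builds the request and leaves sending it to whatever client you use; the helper name and the index name `my-index` are placeholders.

```python
import json
from typing import Tuple


def replica_update_request(index: str, replicas: int) -> Tuple[str, str, str]:
    """Build the method, path and JSON body for changing an index's replica count."""
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    path = f"/{index}/_settings"
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", path, json.dumps(body)


# "my-index" is a placeholder; substitute one of the slow indices.
method, path, body = replica_update_request("my-index", 2)
print(method, path, body)
```

Keep in mind that a replica is only assigned to a node that does not already hold a copy of the same shard, so extra replicas only help if there are enough data nodes to host them.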
Anytime you adjust your sharding and/or replication strategy, you should take your ingest vs. search throughput into consideration. In this scenario, if you start with a single primary shard and one replica and then add four more replicas, that's potentially much more overhead for ingesting new data, since every write has to be replicated out to each copy.
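As a rough back-of-the-envelope illustration of that overhead, you can count how many shard copies each indexed document ends up being written to. This is simplified arithmetic that ignores merges, refreshes and network cost, and the function name is just for illustration.

```python
def shard_copies(primaries: int, replicas: int) -> int:
    """Total shard copies for an index: each primary plus its replicas."""
    return primaries * (1 + replicas)


before = shard_copies(primaries=1, replicas=1)  # the current 1 primary : 1 replica setup
after = shard_copies(primaries=1, replicas=5)   # after adding four more replicas

print(before, after, after / before)
```

With one primary, going from one replica to five takes the index from 2 copies to 6, so each document is written three times as often and the index takes roughly three times the disk space.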