We have an Elasticsearch cluster with one small index and a high read-to-write ratio. The index has one primary shard and several replicas to handle the read load. We turned on the shard request cache for searches, including those that return hits (size > 0). As expected, the cache is completely cleared on the primary and on every replica whenever we update the index. This means the cluster cannot rely on the cache for several minutes, until it has warmed up again, and we may see high latencies during that window.
What strategies can help prevent this from happening?
We don't have strong write consistency requirements. Updates only need to become visible in searches within 10 minutes.
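
For context, here is a minimal Python sketch of what a cached search with size > 0 might look like, along with a request for the index's request cache statistics. The cluster address, index name, and helper function names are placeholders invented for illustration, not our actual setup. The requests are only built here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: placeholder cluster address
INDEX = "my-index"                # assumption: placeholder index name


def build_cached_search(query: dict, size: int = 10) -> Request:
    """Build a search request that opts in to the shard request cache.

    Searches with size > 0 are not cached by default, so the
    request_cache=true query-string parameter is set on each request.
    """
    params = urlencode({"request_cache": "true"})
    # The cache key is derived from the request body, so the body is
    # serialized with sorted keys to keep identical queries byte-identical.
    body = json.dumps({"size": size, "query": query}, sort_keys=True).encode("utf-8")
    return Request(
        f"{ES_URL}/{INDEX}/_search?{params}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_cache_stats_request() -> Request:
    """Build a request for the index's request cache statistics."""
    return Request(f"{ES_URL}/{INDEX}/_stats/request_cache", method="GET")


if __name__ == "__main__":
    search = build_cached_search({"match": {"title": "example"}})
    print(search.get_method(), search.full_url)
    print(search.data.decode("utf-8"))
    print(build_cache_stats_request().full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a live cluster. Comparing the hit and miss counts in the stats response before and after an index update is one way to see how long the cache takes to warm up again.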