I use es cluster to serve a small index (<1G) for searching and aggregation and so only 1 shard is assigned and 1 replica. As 1 node does not able to serve all the request fast enough so the cluster scale to 10 nodes.
I increase the number of replica to 9 so that the total replica is (1+9) so that every node have the copy of the single shard.
I put a load balancer in front of all the nodes so that the http request is distributed across the nodes.
I would like to ask the following:
Is it a general good practice for my case?
Is it a good practice to increase the number of replica so that every node have a local copy of the shard in order to increase the performance?
When the node receive a request, does it prefer to use the shard available in local before reaching to other nodes? should i use
preference=_localto force the node to uses its local shard from the documentation?
 Search shard routing | Elasticsearch Guide [8.6] | Elastic