We have separate nodes for each role: 3 master nodes, 2 client nodes and 8 data nodes.
Each node is an EC2 instance in AWS.
The nodes are evenly distributed across two availability zones (for high availability).
Recently we noticed that Amazon is charging us for DataTransfer Regional (traffic between AZs), mainly on the data and client nodes.
Our application servers query the client nodes through a load balancer, i.e. round robin between the clients (and thus between AZs). The client nodes in turn query each of the data nodes (I guess), no matter which AZ it is in, right?
All indices have a replication factor of 6 (meaning that each shard appears on 7 of the 8 data nodes).
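To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes 4 data nodes per AZ and that a client node picks one of the shard copies uniformly at random, ignoring AZ. That is my guess above, not confirmed behaviour. The function names are made up for illustration.

```python
from fractions import Fraction


def copies_per_shard(number_of_replicas: int) -> int:
    """One primary plus its replicas."""
    return number_of_replicas + 1


def cross_az_fraction(local_copies: int, remote_copies: int) -> Fraction:
    """Share of shard requests that leave the client's AZ, ASSUMING the
    client picks a copy uniformly at random regardless of AZ."""
    return Fraction(remote_copies, local_copies + remote_copies)


copies = copies_per_shard(6)  # replication factor 6 -> 7 copies

# With 7 copies on 8 nodes (4 per AZ), exactly one node lacks a copy,
# so one AZ holds 4 copies of a given shard and the other holds 3.
best_case = cross_az_fraction(local_copies=4, remote_copies=3)
worst_case = cross_az_fraction(local_copies=3, remote_copies=4)

print(copies, best_case, worst_case)  # 7 3/7 4/7
```

So under that assumption roughly half of the client-to-data-node requests cross AZs, on top of the app-server-to-client requests that the round robin already sends to the other AZ.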
My goal is to minimize this charge.
Can I force a client node to query only the data nodes in its own AZ? If so, how?
Any other solutions or ideas?