You can test with the query [from these docs](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-shards.html): if you send it to your first node, you can see which shards and nodes it'll use.
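If it helps, here is a small Python sketch of calling that search shards API and summarizing which nodes hold a copy of each shard. The function names, the `http://localhost:9200` address and the index name `my-index` are placeholders I made up for illustration. The parsing assumes the response has a top-level `nodes` map and a `shards` list of per-shard copy groups, so check it against the output your cluster version actually returns.

```python
import json
import urllib.request
from collections import defaultdict


def fetch_search_shards(base_url: str, index: str) -> dict:
    """GET <base_url>/<index>/_search_shards and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/{index}/_search_shards") as resp:
        return json.load(resp)


def copies_per_shard(response: dict) -> dict:
    """Map each shard number to the nodes that hold a copy of it."""
    # Translate node IDs to readable names, falling back to the ID itself.
    node_names = {
        node_id: info.get("name", node_id)
        for node_id, info in response.get("nodes", {}).items()
    }
    result = defaultdict(list)
    # "shards" is assumed to be a list of groups, one group per shard,
    # where each group lists every copy (primary and replicas).
    for group in response.get("shards", []):
        for copy in group:
            role = "primary" if copy.get("primary") else "replica"
            name = node_names.get(copy["node"], copy["node"])
            result[copy["shard"]].append(f"{name} ({role})")
    return dict(result)
```

Against a live cluster you would call something like `copies_per_shard(fetch_search_shards("http://localhost:9200", "my-index"))`, pointing the URL at the node you want to use as coordinator.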
Generally, unless you target a document ID/shard or otherwise route to a single shard, the coordinating node (the one your client talks to) must do a scatter/gather operation and send the query to all of an index's shards.
The question is how it chooses, in your case, among the 4 copies of each shard. This is called Adaptive Replica Selection, and there is a nice doc on that.
My reading of that is that, among other things, it uses past performance as an input, so in theory, if all queries were local and fast, it might pick the local shards and not bother with copies on other nodes.
But the queue also matters, so if the local node is busy, it may send to other nodes.
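To make that intuition concrete, here is a deliberately simplified toy model in Python. It is not Elasticsearch's actual ranking formula: the `CopyStats` fields, the scoring in `toy_rank`, and the numbers in the usage note are all assumptions of mine, chosen only to show how past response times and queue depth could pull the choice in different directions.

```python
from dataclasses import dataclass


@dataclass
class CopyStats:
    node: str               # node holding this copy of the shard
    avg_response_ms: float  # recent response time seen by the coordinating node
    avg_service_ms: float   # recent time the node spent executing a search
    queue_size: int         # searches currently waiting on that node


def toy_rank(stats: CopyStats) -> float:
    """Lower is better: past latency plus a penalty for queued work (toy formula)."""
    return stats.avg_response_ms + stats.queue_size * stats.avg_service_ms


def pick_copy(copies: list) -> CopyStats:
    """Choose the shard copy with the lowest toy rank."""
    return min(copies, key=toy_rank)
```

With an idle local copy such as `CopyStats("local", 5.0, 4.0, 0)` and a slightly slower remote one such as `CopyStats("remote", 8.0, 4.0, 0)`, this picks the local copy. Raise the local `queue_size` to 10 and the remote copy wins instead, which is the "busy local node" case described above.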