I'm not sure I'm understanding you. Can more than 1 thread read the same shard simultaneously? (I don't know ES's locking mechanisms)
Regardless, are you saying to assign fewer than one shard per query thread?
My setup: multiple shards with virtually identical terms dictionaries, heavy concurrent queries, and every query is routable to a single shard.
If we had 1 node with the 8 vCPUs max you indicated in another post, the query thread pool would be 13 × 1 (13 threads on 1 node). How many shards would you suggest given the data I provided?
I was going with 52, which allows me to start with a 6MiB RAM server and scale to 4 nodes before needing to reindex.
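For what it's worth, here is the arithmetic behind the 13 and the 52 as a quick Python sketch. It assumes the default search thread pool size given in the Elasticsearch docs, int((allocated processors * 3) / 2) + 1, plus my own plan of one shard per search thread with no replicas. The function names are made up for illustration and are not part of any ES API.

```python
def search_pool_size(allocated_processors: int) -> int:
    """Default per-node size of the Elasticsearch search thread pool.

    Documented default: int((allocated_processors * 3) / 2) + 1
    """
    return (allocated_processors * 3) // 2 + 1


def shards_for(nodes: int, vcpus_per_node: int) -> int:
    """Shard count under my assumption of one shard per search thread.

    Assumption (mine, not an ES rule): no replicas, and every node
    has the same number of vCPUs.
    """
    return nodes * search_pool_size(vcpus_per_node)


if __name__ == "__main__":
    print(search_pool_size(8))  # threads on one 8-vCPU node
    print(shards_for(1, 8))     # shards needed to match 1 node
    print(shards_for(4, 8))     # shards needed to match 4 nodes
```

With 8 vCPUs this gives 13 threads per node, so 13 shards keep one node busy and 52 cover 4 nodes, which is where my number comes from.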