If I understand this correctly, you have a single index in the cluster with 2 replicas (15 shards in total) across 11 data nodes. This means that each node holds only 1 or 2 shards. As your data set is reasonably small, have you tried increasing the replica count further to spread the load more evenly?
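To make the arithmetic concrete, here is a rough sketch in Python. It assumes the 15 shards are 5 primaries with 2 replicas each (inferred from the numbers above, not confirmed) and that the shards are balanced evenly across the data nodes. The helper names are made up for illustration; the JSON body is what you would send to the update index settings API (`PUT /<your-index>/_settings`).

```python
import json
import math


def shard_spread(primaries: int, replicas: int, data_nodes: int) -> dict:
    """Return total shard copies and the per-node range, assuming even balancing."""
    total = primaries * (1 + replicas)
    return {
        "total": total,
        "min_per_node": total // data_nodes,
        "max_per_node": math.ceil(total / data_nodes),
    }


def replica_settings_body(replicas: int) -> str:
    """Build the JSON body for PUT /<your-index>/_settings (index name not included)."""
    return json.dumps({"index": {"number_of_replicas": replicas}})


# Assumed current layout: 5 primaries x (1 primary + 2 replicas) on 11 data nodes.
current = shard_spread(primaries=5, replicas=2, data_nodes=11)

# Upper bound: copies of the same shard never share a node, so at most
# data_nodes - 1 replicas can be assigned (here 10, one copy per node).
full = shard_spread(primaries=5, replicas=10, data_nodes=11)

print(current)
print(full)
print(replica_settings_body(10))
```

With 10 replicas every node would hold a copy of all 5 shards, so any node could serve any search locally. Whether that is worth the extra disk and indexing cost depends on how often the data changes.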
Can you also provide the output of the cluster stats API?
How frequently are you updating or indexing data?