Here is another useful tip when facing these kinds of problems.
For each shard in an index, Elasticsearch uses a thread to search that shard.
That means that if I have 4 vCPUs in a node, I can only search 4 shards truly in parallel.
See here
So if you have a monster index with 5 primary shards on a node with many more vCPUs (16-32), you aren't actually using those vCPUs, since no more than 5 threads will be spawned to search the shards of that index.
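As a rough illustration of that reasoning, here is a minimal sketch of the simplified model used above: one search thread per shard and one running thread per vCPU. The function name `busy_vcpus` is made up for this example, and the model deliberately ignores replicas, concurrent queries and the actual size of Elasticsearch's search thread pool.

```python
def busy_vcpus(primary_shards: int, vcpus: int) -> int:
    """Estimate how many vCPUs a single search can keep busy.

    Simplified model (an assumption, not Elasticsearch's real scheduler):
    one thread per shard, and at most one running thread per vCPU.
    """
    if primary_shards < 0 or vcpus < 0:
        raise ValueError("primary_shards and vcpus must be non-negative")
    return min(primary_shards, vcpus)


if __name__ == "__main__":
    # A 5-shard index on nodes of different sizes.
    for vcpus in (4, 16, 32):
        used = busy_vcpus(5, vcpus)
        print(f"{vcpus} vCPUs, 5 shards: {used} busy, {vcpus - used} idle")
```

Under this model, a single query against a 5-shard index cannot occupy more than 5 cores, however large the node is.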
Increasing the number of primary shards will make better use of those vCPUs, since Elasticsearch spawns a thread for each shard and the shards are then searched in parallel.
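The primary shard count is set through the `number_of_shards` index setting when the index is created, so for an existing index this usually means creating a new index and reindexing into it. Below is a small sketch that only builds the settings body you would send with the create-index request; the helper name `index_settings` and the values 16 and 1 are just examples, not recommendations.

```python
import json


def index_settings(primary_shards: int, replicas: int = 1) -> str:
    """Build a JSON body for creating an index with the given shard counts.

    Nothing is sent to a cluster here; the string is meant to be used as
    the body of a create-index request (PUT /<your-index-name>).
    """
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    body = {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    # Example: 16 primary shards so that a 16-vCPU node can be kept busy.
    print(index_settings(16))
```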
So that is how more vCPUs help.
A faster CPU helps because the search within each shard runs faster.
So if a 2.0 GHz CPU takes 800 ms to search a shard, a 3.0 GHz CPU will be much faster at searching that shard (~200-300 ms).
More RAM seems to help because Elasticsearch takes up some RAM for every new index and every new shard that is created (exactly how much, I don't really know).
This guide on shard size helped me better manage CPU and RAM requirements.
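To make the trade-off concrete, here is a rough sketch for picking a primary shard count from the index size. The 30 GB default target is an assumption for illustration (shard sizes in the range of roughly 10-50 GB are commonly recommended), and `primary_shards_for` is a hypothetical helper, not something taken from the guide.

```python
import math


def primary_shards_for(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Suggest a primary shard count so each shard is about target_shard_gb.

    The default target size is an assumed value; adjust it to your own
    measurements. At least one shard is always returned.
    """
    if index_size_gb < 0 or target_shard_gb <= 0:
        raise ValueError("index size must be >= 0 and target shard size > 0")
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    for size_gb in (10, 150, 600):
        print(f"{size_gb} GB index: {primary_shards_for(size_gb)} primary shard(s)")
```

More shards give more parallelism per query but also cost more RAM, so it is worth comparing the result against the number of vCPUs and the heap available on your nodes.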