Nodes report a big difference in average took_millis on the same data

Hi! I am running a load test against an index with 4 shards, 1 shard per node. After the test I take the average of took_millis from the slow query logs per node (I log every query that takes longer than 1 ms, so effectively all queries) and see something like this:
node1: 800 ms
node2: 200 ms
node3: 200 ms
node4: 400 ms
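
For reference, per-node numbers like these can be computed roughly as follows. This is only a minimal sketch: it assumes the plain-text slow log format in which each entry contains a `took_millis[N]` field, and the function name `summarize_took_millis` and the `lines_by_node` input (node name mapped to that node's slow log lines) are hypothetical names made up for illustration. It also reports the median and maximum, since a few very slow queries can pull the average up.

```python
import re
from collections.abc import Iterable
from statistics import mean, median

# Assumption: plain-text slow log entries contain a field like "took_millis[123]".
TOOK_RE = re.compile(r"took_millis\[(\d+)\]")


def summarize_took_millis(lines_by_node: dict[str, Iterable[str]]) -> dict[str, dict]:
    """Summarize took_millis per node from slow log lines.

    lines_by_node maps a node name to an iterable of that node's slow log lines.
    Lines without a took_millis field are skipped.
    """
    summary = {}
    for node, lines in lines_by_node.items():
        values = []
        for line in lines:
            match = TOOK_RE.search(line)
            if match:
                values.append(int(match.group(1)))
        if values:
            summary[node] = {
                "count": len(values),
                "mean": mean(values),
                "median": median(values),
                "max": max(values),
            }
    return summary
```

Calling it with something like `summarize_took_millis({"node1": open("node1_slowlog.log")})` (the file name is a placeholder) returns one entry per node. If the mean is much higher than the median on one node, a small number of very slow queries is probably responsible rather than uniformly slower execution.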

I then swap the shards of node1 and node2, and the shards of node3 and node4, run the test again with the same queries on a different day, and still see that node1's times are 4 times higher than those of node2 and node3. What could be the reason for this? The nodes are VMs, so I suspect that their CPUs perform differently, and we are investigating this. Is this a reasonable thing to suspect, and what else could I do to get equal performance? Thanks for your suggestions!