I have been running Elasticsearch on AWS for some time now on 2 x r3.xlarge instances, with a t2.small as a tie-breaker, and have had no issues. However, the cost of the stack is high and I am looking for ways to optimise and save money. With the recent move to doc values as the default and less emphasis on memory, I have been experimenting with smaller instances and EBS storage rather than expensive instance stores.
I first tried 2 x t2.large, each with an 80 GB gp2 (default SSD) EBS volume, but after 45 minutes or so of running a large test batch, disk latency climbs sharply and requests start timing out. I then tried 2 x m4.large with EBS optimisation and saw exactly the same behaviour.
Any advice on which AWS limits I might be hitting, and on possible next steps, would be hugely appreciated. Looking at the cost of provisioned IOPS, I would be better off reverting to the r3 instances with instance storage.
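One candidate limit I have not yet ruled out is the gp2 burst bucket. Below is a rough sketch of the arithmetic in Python, assuming the published gp2 figures (3 IOPS per GiB baseline with a 100 IOPS floor, bursts up to 3,000 IOPS, a 5.4 million I/O credit bucket) and assuming the test batch drives a steady IOPS rate, which I have not measured. The function names are made up for illustration.

```python
# Published gp2 figures (assumed to apply to these small volumes).
GP2_BASELINE_IOPS_PER_GIB = 3
GP2_MIN_BASELINE_IOPS = 100
GP2_BURST_IOPS = 3000
GP2_CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket


def baseline_iops(size_gib):
    """Baseline IOPS a gp2 volume of this size earns continuously."""
    return max(GP2_MIN_BASELINE_IOPS, GP2_BASELINE_IOPS_PER_GIB * size_gib)


def minutes_until_bucket_empty(size_gib, sustained_iops):
    """Minutes a full credit bucket lasts under a steady IOPS demand.

    Returns infinity when the demand is at or below the baseline,
    because credits are then never drained.
    """
    base = baseline_iops(size_gib)
    demand = min(sustained_iops, GP2_BURST_IOPS)  # cannot exceed burst cap
    if demand <= base:
        return float("inf")
    return GP2_CREDIT_BUCKET / (demand - base) / 60


if __name__ == "__main__":
    print(baseline_iops(80))                     # baseline for an 80 GiB volume
    print(minutes_until_bucket_empty(80, 3000))  # flat out at the burst cap
    print(minutes_until_bucket_empty(80, 2240))  # hypothetical steady load
```

Under those assumptions an 80 GiB volume has a 240 IOPS baseline, and a steady load of roughly 2,240 IOPS would empty a full bucket in about 45 minutes, after which the volume drops back to baseline. That would match the timing I am seeing on both instance types, but it is only a guess until I check the volume's BurstBalance metric in CloudWatch during a test run.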