We're currently running an Elasticsearch cluster as an ELK backend on i3.2xlarge EC2 instances in AWS, with the data partition on the local Instance Store, which is generally what the official Elastic documentation recommends (https://www.elastic.co/guide/en/elasticsearch/plugins/master/cloud-aws-best-practices.html).
However, our log volume is reaching the point where we're close to filling the 2TB instance store disks, which can't be resized. We're already using Curator to archive off old logs, so we need more storage.
Rather than scaling out with more instances, we're investigating using EBS volumes for the data store instead. These can be made larger and (to an extent) have scalable performance, but are unlikely to match the "local SSD" performance of Instance Store drives. On the other hand, an EBS volume can be detached from a failed instance (e.g. after a hardware failure) and attached to a healthy one, whereas instance store data is ephemeral and is lost with the instance.
Has anyone done this successfully, or got any data or benchmarks? For reference, our PROD ELK cluster takes in ~150GB of logs a day. The official Elasticsearch documentation gets a bit vague when comparing EBS volumes with Instance Stores, saying that EBS should be for "smaller" clusters and to "make sure you have enough IOPS".
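To give a sense of scale, here is a rough back-of-envelope sketch of the capacity side of the problem. Only the ~150GB/day figure is real; the retention period, replica count, per-node EBS size and the assumption that 150GB of logs takes about 150GB on disk per copy are placeholders I made up for illustration, and the 85% usable fraction is there to stay under Elasticsearch's default low disk watermark.

```python
import math


def required_storage_gb(daily_gb, retention_days, replicas=1, index_overhead=1.0):
    """Total on-disk GB across the cluster for all retained indices.

    index_overhead is an assumed ratio of indexed size to raw log size;
    1.0 means "indexed data is the same size as the raw logs".
    """
    return daily_gb * retention_days * (1 + replicas) * index_overhead


def nodes_needed(total_gb, disk_gb_per_node, usable_fraction=0.85):
    """Data nodes needed to hold total_gb while keeping each disk
    below usable_fraction of its capacity."""
    return math.ceil(total_gb / (disk_gb_per_node * usable_fraction))


if __name__ == "__main__":
    DAILY_GB = 150          # from our PROD cluster
    RETENTION_DAYS = 30     # hypothetical retention period
    REPLICAS = 1            # hypothetical: one replica per primary shard

    total = required_storage_gb(DAILY_GB, RETENTION_DAYS, replicas=REPLICAS)
    print(f"Total storage needed: {total:.0f} GB")

    # i3.2xlarge instance store (1900 GB) vs. a hypothetical 4000 GB EBS volume
    for label, disk_gb in [("instance store", 1900), ("EBS (assumed size)", 4000)]:
        print(f"{label}: {nodes_needed(total, disk_gb)} data nodes")
```

This only covers capacity. It says nothing about whether EBS can sustain the indexing and query IOPS, which is the part I'm really unsure about.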
I'd love to know if anyone has any experience with this. Does provisioning enough IOPS on EBS simply get prohibitively expensive for a "larger" cluster?
Thanks for any help/thoughts/experience!