We have a cluster holding ~400 GB of data, spread across 6 data nodes (2 nodes × 3 zones). Currently, in Elastic Cloud, all of these are I3 machines. We want to understand what would be better in ECK: I3 nodes using local storage, or another node type such as R5/M5 using EBS volumes. It seems I3 might be better keeping memory in mind, but EBS volumes would make it faster to restore data in case of node failure. Please suggest.
I think it is really difficult to make a suggestion here, as it depends on many factors particular to your use case.
As you have already pointed out in your question, EBS volumes give you more operational flexibility around node failure and node replacement. It will also be easier to reschedule pods after upgrades when you are not bound to a particular k8s node, as you would be with local storage.
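To make the difference concrete, here is a minimal sketch of where the storage choice shows up in an ECK manifest. Python is used only to build the manifest as a dictionary and print it as JSON, which `kubectl apply -f` accepts. The storage class names (`gp3`, `local-storage`), the cluster name, the Elasticsearch version, and the `200Gi` request are all assumptions for illustration; substitute whatever exists in your cluster. Zone pinning (node affinity per node set) is left out to keep the sketch short.

```python
import json


def node_set(zone, storage_class, size="200Gi", count=2):
    """Build one ECK nodeSet for a zone, including its data volume claim."""
    return {
        "name": f"data-{zone}",
        "count": count,
        "volumeClaimTemplates": [
            {
                # ECK expects the data volume claim to use this name.
                "metadata": {"name": "elasticsearch-data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    # Hypothetical class name: an EBS-backed class or a
                    # local-volume class, depending on the option chosen.
                    "storageClassName": storage_class,
                    "resources": {"requests": {"storage": size}},
                },
            }
        ],
    }


def manifest(storage_class, zones=("a", "b", "c")):
    """Build an Elasticsearch resource with 2 data nodes in each of 3 zones."""
    return {
        "apiVersion": "elasticsearch.k8s.elastic.co/v1",
        "kind": "Elasticsearch",
        "metadata": {"name": "example"},  # hypothetical cluster name
        "spec": {
            "version": "8.13.0",  # assumed version, adjust to yours
            "nodeSets": [node_set(zone, storage_class) for zone in zones],
        },
    }


if __name__ == "__main__":
    # Swap "gp3" for a local-volume storage class to compare both options.
    print(json.dumps(manifest("gp3"), indent=2))
```

With an EBS-backed class the claim can follow the pod to another k8s node in the same zone, while a local-volume class ties the claim, and therefore the pod, to the node that holds the disk.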
We have done some benchmark testing against EBS volumes with ECK and think I/O performance will be sufficient for many use cases. But there is no way around benchmarking your own application against this hardware setup to make sure you get good enough performance out of the EBS volumes for your specific use case.
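As a first, very rough comparison before running a full benchmark, you could measure synced small-write latency on each volume type. The sketch below is an assumption about what is worth measuring, not the benchmark we ran: it writes small blocks and calls `fsync` after each one, which is loosely similar to how the translog is synced by default. The function name, block size, and block count are made up for illustration, and the result says nothing about search or merge performance, so it does not replace benchmarking your real workload.

```python
import os
import sys
import tempfile
import time


def fsync_write_benchmark(directory, block_size=4096, blocks=256):
    """Write `blocks` blocks of `block_size` bytes, fsync each, return stats."""
    payload = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(blocks):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the write down to the device
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)

    latencies.sort()
    total = max(sum(latencies), 1e-9)  # guard against a zero total
    p99_index = min(blocks - 1, int(0.99 * blocks))
    return {
        "writes": blocks,
        "mean_ms": 1000 * total / blocks,
        "p99_ms": 1000 * latencies[p99_index],
        "mb_per_s": block_size * blocks / total / 1e6,
    }


if __name__ == "__main__":
    # Pass a directory on the volume under test, e.g. the mounted data path.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    for key, value in fsync_write_benchmark(target).items():
        print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```

Run it once against a directory on an EBS-backed volume and once against instance-local storage, and compare mainly the p99 latency rather than the mean.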
Finally, cost might also be a factor in this decision, as EBS volumes incur additional cost that you won't have when using instance-local storage.