I am using GKE, and we would like to build several Elasticsearch clusters using ECK. Some of them will be rather large (10+ TB) with decent throughput (1+ TB per day); others are smaller.
We are wondering whether it is a good idea to use local SSDs (as local persistent volumes) instead of network-attached persistent disks, since they offer much better performance at a lower price ($0.08 instead of $0.17 per SSD GB per month). I am fully aware that a local SSD has a few downsides, such as:
- It is bound to a single Kubernetes node
- It cannot be resized
- Data loss is possible because it is not redundant, unlike persistent volumes
- Capacity must be a multiple of 375 GB partitions, and performance scales with the size and number of partitions
However, even after considering these downsides, I think it could be a good idea to use them (maybe only for our hot nodes?). The resizing problem can be worked around by adding or removing ES nodes, and the risk of data loss can be reduced by using Elasticsearch replicas, which also improve read performance because replicas serve queries too.
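To put rough numbers on the trade-off, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above (375 GB partitions, $0.08 vs. $0.17 per GB per month). The function names, the 1 TB = 1000 GB conversion, and the choice of one replica are my own assumptions, and it ignores node costs and free-disk headroom.

```python
import math

PARTITION_GB = 375        # local SSD partition size mentioned above
LOCAL_SSD_PRICE = 0.08    # USD per GB per month, figure quoted above
PD_SSD_PRICE = 0.17       # USD per GB per month, figure quoted above


def local_ssd_plan(primary_data_gb, replicas=1):
    """Return (partitions, provisioned_gb, monthly_cost_usd) for local SSDs.

    Capacity is rounded up to whole 375 GB partitions.
    """
    needed_gb = primary_data_gb * (1 + replicas)
    partitions = math.ceil(needed_gb / PARTITION_GB)
    provisioned_gb = partitions * PARTITION_GB
    return partitions, provisioned_gb, provisioned_gb * LOCAL_SSD_PRICE


def pd_ssd_cost(primary_data_gb, replicas=1):
    """Monthly cost in USD for the same data on SSD persistent disks."""
    needed_gb = primary_data_gb * (1 + replicas)
    return needed_gb * PD_SSD_PRICE


if __name__ == "__main__":
    # Assumption: 10 TB of primary data, one replica of every shard.
    partitions, gb, local_cost = local_ssd_plan(10_000, replicas=1)
    print(f"local SSD: {partitions} partitions, {gb} GB, ${local_cost:.2f}/month")
    print(f"SSD persistent disk: ${pd_ssd_cost(10_000, replicas=1):.2f}/month")
```

Under these assumptions, 10 TB of primary data with one replica needs 54 partitions (20,250 GB), which is roughly $1,620 per month on local SSDs versus roughly $3,400 per month on SSD persistent disks with the same replica count.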
My only concern is whether ECK supports this. Whenever a node is removed, its data must be transferred to a different node or it will be lost. When a node dies, the disk is gone as well, right? Are there any other concerns or issues I should be aware of?
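For reference, the manual equivalent I have in mind is Elasticsearch's shard allocation filtering, where a node is excluded by name so its shards move elsewhere before it is removed. Below is a minimal Python sketch that only builds the request body for `PUT _cluster/settings`. The helper name and the node name are hypothetical, and I do not know whether ECK does something like this on its own.

```python
import json


def drain_settings(node_names):
    """Build a cluster settings body that excludes the given nodes
    from shard allocation, so their shards relocate to other nodes."""
    return {
        "persistent": {
            "cluster.routing.allocation.exclude._name": ",".join(node_names)
        }
    }


if __name__ == "__main__":
    # Hypothetical node name, for illustration only.
    body = drain_settings(["hot-es-data-0"])
    print(json.dumps(body, indent=2))
```

This would only help with planned removals. If a Kubernetes node dies unexpectedly, I assume recovery depends entirely on the replicas.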
I am surprised that the documentation barely discusses this topic, given the potential performance impact.