Does ECK support local persistent disks and is it a good idea?

Hello,
We are using GKE and would like to build several Elasticsearch clusters using ECK. Some of them will be rather big (10+ TB) with a decent throughput (1+ TB per day); others are smaller.

We are wondering whether it's a good idea to use local persistent disks (instead of persistent volumes), as they offer much better performance at a lower price ($0.08 instead of $0.17 per SSD GB per month). I am fully aware that local persistent disks have a few downsides, such as:

  • The disk is bound to a single Kubernetes node
  • It cannot be resized
  • Data loss is possible, as the disk is not redundant (unlike persistent volumes)
  • Capacity must be a multiple of 375 GB partitions, and performance scales with the size/number of the disks/partitions as well
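
To put rough numbers on the last point and on the price difference, here is a back-of-the-envelope sketch. It only uses the figures quoted in this post (375 GB partitions, $0.08 vs. $0.17 per GB per month); the function names, the 1 TB = 1000 GB conversion and the single-replica default are assumptions, and it ignores disk watermarks and any other headroom.

```python
import math

PARTITION_GB = 375        # local SSD partition size quoted above
LOCAL_SSD_PRICE = 0.08    # USD per GB per month, as quoted in this post
PD_SSD_PRICE = 0.17       # USD per GB per month, as quoted in this post


def local_ssd_plan(data_tb, replicas=1):
    """Return (partitions, provisioned_gb) for primaries plus replicas.

    Assumes 1 TB = 1000 GB and no extra headroom (hypothetical simplification).
    """
    needed_gb = data_tb * 1000 * (1 + replicas)
    partitions = math.ceil(needed_gb / PARTITION_GB)
    return partitions, partitions * PARTITION_GB


def monthly_cost(gb, price_per_gb):
    """Monthly storage cost in USD for the given capacity."""
    return gb * price_per_gb


partitions, provisioned_gb = local_ssd_plan(10, replicas=1)
print(partitions, provisioned_gb)  # 54 partitions, 20250 GB
print(monthly_cost(provisioned_gb, LOCAL_SSD_PRICE))
print(monthly_cost(provisioned_gb, PD_SSD_PRICE))
```

Note that the replica copy is counted on purpose: with local disks the replicas are what provides redundancy, so the raw per-GB price gap narrows if the persistent-volume setup would have run with fewer copies.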

However, even after considering these downsides, I think it could be a good idea to use them (maybe only for our hot nodes?). The resizing problem can be solved by adding or removing ES nodes. The risk of data loss can be reduced by using Elasticsearch replicas, which also improve read performance because replicas serve queries too.

My only concern is whether ECK supports this. Whenever a node is removed, its data must be transferred to a different node or it will be lost. As a node dies, the disk is gone as well, right? Are there any other concerns or issues I should be aware of?

I am surprised the documentation barely discusses this topic, given the potential performance impact.

ECK is agnostic to the type of storage used, so if you can operate it with StatefulSets, it should work with ECK. Performance-wise, the storage recommendations are the same as for Elasticsearch in general; they are no different on Kubernetes.

Operating local volumes on Kubernetes is still a bit of a challenge, but one that is not specific to ECK and applies to all stateful workloads. Off the top of my head, this issue is still present and must be worked around (with a double delete, normally).

As a node dies, the disk is gone as well, right?

Yes. The corresponding Pod will stay Pending as it cannot be scheduled on any other Node with the same local PersistentVolume. At this point you can either:

  • attempt to recover the host and its data
  • manually delete the PersistentVolumeClaim and Pod, so a new Pod gets scheduled with an empty volume. This leads to data loss if you do not have replicas of the data on other Elasticsearch nodes.
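
A minimal sketch of the second option, written as a small helper that prints the kubectl commands rather than running them. It assumes the claim uses ECK's default volume claim name `elasticsearch-data`, so the PVC is named `elasticsearch-data-<pod name>`; the pod name `quickstart-es-default-0` and the `default` namespace are hypothetical placeholders. Check the real names with `kubectl get pvc` first.

```python
def recovery_commands(pod_name, namespace="default", claim_name="elasticsearch-data"):
    """Build the kubectl commands for the manual PVC + Pod deletion.

    StatefulSet PVCs are named <claim name>-<pod name>; the default
    claim_name here is an assumption about the cluster's manifest.
    """
    pvc_name = f"{claim_name}-{pod_name}"
    return [
        # --wait=false: the PVC stays in Terminating until the Pod using it is gone
        f"kubectl delete pvc {pvc_name} --namespace {namespace} --wait=false",
        f"kubectl delete pod {pod_name} --namespace {namespace}",
    ]


# Hypothetical pod name, for illustration only.
for command in recovery_commands("quickstart-es-default-0"):
    print(command)
```

Only do this once the cluster has replicas of the affected shards elsewhere, since the new Pod starts with an empty volume.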

Thanks for the quick response.
So a common operation such as a rolling node upgrade (say, because of a Kubernetes upgrade) would already cause a problem that ECK cannot handle.

I'd say it could be handled more gracefully by transferring the data from the leaving node to the new node, but I am not sure whether this is something ECK could even be responsible for.

As far as I can tell, automatic data migration is currently not possible in Kubernetes with ECK, because the pod won't be scheduled (because of the missing PV). Thus I don't see how one could currently use local storage in Kubernetes, or am I missing something? This would be a pity, because local disk IOPS seem to be 10x higher than those of a persistent volume of the same size offered by Google.
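
For reference, the manual form of the data transfer discussed above would be Elasticsearch's shard allocation filtering: exclude the node that is about to leave, wait until its shards have relocated, then remove it. The sketch below only builds the request bodies for `PUT _cluster/settings`; the helper names and the node name are hypothetical, and nothing in this thread confirms that ECK performs this step automatically during a Kubernetes node upgrade.

```python
import json

EXCLUDE_SETTING = "cluster.routing.allocation.exclude._name"


def exclude_node_body(node_name):
    """Body for PUT _cluster/settings that moves shards off node_name."""
    return {"persistent": {EXCLUDE_SETTING: node_name}}


def clear_exclusion_body():
    """Body that resets the exclusion once the node has been replaced."""
    return {"persistent": {EXCLUDE_SETTING: None}}


# Hypothetical Elasticsearch node name, for illustration only.
print(json.dumps(exclude_node_body("quickstart-es-default-0")))
print(json.dumps(clear_exclusion_body()))
```

The relocation only succeeds if the remaining nodes have enough free disk for the moved shards, which matters more with fixed-size local disks than with resizable volumes.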