Can Elasticsearch on Kubernetes safely reuse data from PVs that were detached after a node went down?

I'm looking into different ways to deploy Elasticsearch on Kubernetes.

My understanding so far is that the usual way to deploy on Kubernetes is the ECK operator pattern, which results in StatefulSets being used. Each data node's pod in the StatefulSet uses a persistent volume claim (PVC) to get a disk. When a data node's pod goes down, Kubernetes brings up a new pod that re-attaches the same disk through the pod's PVC.

Meanwhile, as I understand Elasticsearch rebalancing, the cluster goes yellow while the pod is down and shards are rebalanced: for any shard with fewer replicas than desired, data is copied to other data nodes until the number of running replicas is back at the desired count.

So when the data node eventually comes back online, and the disk is automatically re-attached, won't Elasticsearch then be in a state where the re-attached disk has an extra copy of the data that was previously copied onto the other remaining running data nodes? If this is the case, I'm wondering how the Elasticsearch cluster would sort itself out:

  • What does it do with this extra copy? Does it delete one of the copies or keep it around forever?
  • If deleting a copy, which one does it choose to delete?
  • Suppose Kubernetes brings the new data node online faster than the rebalance can finish, so the cluster stays yellow the whole time (likely, in my opinion, on a platform like GKE that can provision Kubernetes nodes quickly). Won't there then be an incomplete/corrupt copy of the data on one of the data nodes? Can Elasticsearch deal with this safely?

I have seen it work, but that's not to say it's fully supported.

I guess it will come down to how long the instance has been out of the cluster, i.e. how old the shard/segment data on its disk is, and so whether recovery can be partial or has to be complete.
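One way to see whether a recovery was partial or complete is to look at how many bytes of each shard copy were reused from the local disk versus copied over the network. The sketch below is only illustrative: it assumes a response shaped like the index recovery API (`GET /_recovery`) with `type`, `stage`, and `index.size.*_in_bytes` fields, and the helper names `reused_fraction` and `summarize_recovery` are made up for this example.

```python
from typing import Dict, List


def reused_fraction(shard: Dict) -> float:
    """Fraction of a shard copy's bytes that were reused from local disk."""
    size = shard["index"]["size"]
    total = size["total_in_bytes"]
    # An empty shard has nothing to copy, so treat it as fully reused.
    return size["reused_in_bytes"] / total if total else 1.0


def summarize_recovery(recovery: Dict) -> List[str]:
    """One line per shard copy: index[shard] type stage reused=NN%."""
    lines = []
    for index_name, body in sorted(recovery.items()):
        for shard in body["shards"]:
            pct = 100 * reused_fraction(shard)
            lines.append(
                f"{index_name}[{shard['id']}] {shard['type']} "
                f"{shard['stage']} reused={pct:.0f}%"
            )
    return lines
```

A high reused percentage would suggest the returning node kept most of its old data and only caught up on the difference; a value near zero would suggest the copy was rebuilt from scratch.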

It deletes any extra shard data once the shard health is green.

It deletes the shard data that doesn't belong to an assigned shard.

There will be such a copy, and yes, Elasticsearch handles that safely.
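Related to the last point: Elasticsearch does not start re-creating replicas the instant a node leaves. The `index.unassigned.node_left.delayed_timeout` index setting (one minute by default) delays that allocation, which gives a quickly restarted pod a chance to rejoin with its existing disk. A minimal sketch of raising that timeout and reading the cluster health response follows. The base URL, the absence of TLS and authentication, and the helper names are assumptions for illustration only; an ECK-managed cluster would normally need HTTPS and credentials.

```python
import json
import urllib.request
from typing import Dict


def delayed_allocation_body(timeout: str = "5m") -> Dict:
    """Settings body that delays replica reallocation after a node leaves."""
    return {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}


def put_settings(base_url: str, body: Dict) -> Dict:
    """Apply the settings to all indices (base_url is a hypothetical endpoint)."""
    req = urllib.request.Request(
        f"{base_url}/_all/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def describe_health(health: Dict) -> str:
    """Summarize a GET /_cluster/health response in one line."""
    status = health["status"]
    if status == "green":
        return "green: all shard copies assigned"
    pending = health["unassigned_shards"] + health["initializing_shards"]
    delayed = health.get("delayed_unassigned_shards", 0)
    return (
        f"{status}: {pending} shard copies pending, "
        f"{delayed} waiting on delayed allocation"
    )
```

For example, `put_settings("http://localhost:9200", delayed_allocation_body("10m"))` would send the request against a local, unsecured test cluster; nothing is sent until that function is called.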


Thanks for the info! While playing around with ECK, I also noticed that when I killed a pod in the StatefulSet, a new pod came up automatically but no new persistent disks were created in GCP (where I was testing). So I saw proof of this behavior in ECK. I didn't test how it would behave if data was being indexed while this happened, but it sounds like it would have been fine.
