Volume Expansion StatefulSet Recreation Logic

Hi,

I've been looking into how the ECK operator upsizes the persistent volumes used by the ES cluster. Having looked at the PR that added this functionality and read the comments, I wanted to confirm my understanding: the StatefulSets' recreation is coordinated through special annotations added to the same ES custom resource object that the operator watches for performing cluster updates. Is that correct? If so, what stops a client from overwriting these annotations and interfering with the StatefulSet recreation?

I was also wondering why this approach was taken rather than blocking to sequentially delete and recreate each relevant StatefulSet, which would avoid using k8s annotations to coordinate recreation entirely.

I haven't dug too deeply into the codebase so was hoping to get some pointers on where my understanding may be off.

Thanks!

Nothing. However, the StatefulSet would still be recreated eventually based on the Elasticsearch spec. I would consider this a malicious client, and we cannot guarantee correct operation in that case.

The motivation for the annotation based approach stems from this comment:

We want to avoid a situation where we end up with orphaned Pods that remain after a user decides to rename a node set (and thus makes the now deleted StatefulSet obsolete) in the middle of the resizing process. Admittedly, this is somewhat of an edge case, but it is still possible. An earlier version of the implementation deleted and recreated the StatefulSets directly. Another reason we don't do that in a blocking way is that we cannot guarantee that the operator will not be interrupted in the middle (e.g. manual restart, OOM, operator upgrade, etc.). Any algorithm we use needs to be resilient against that. We also try to avoid long blocking operations in the operator, because they reduce the operator's ability to process changes for other Elasticsearch clusters while we are blocking.
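
To make the idea more concrete, here is a minimal sketch in Go of an annotation-driven, non-blocking recreation step. It is not the actual ECK code: the types `StatefulSet`, `Cluster` and `Store`, the annotation prefix, and the functions `RequestRecreation` and `ReconcileRecreations` are all hypothetical, in-memory stand-ins for the real Kubernetes objects and API calls. The point it tries to show is that the pending work is recorded on the custom resource, so every reconcile call does one small idempotent step and can pick up where a previous, interrupted run left off.

```go
package main

import (
	"encoding/json"
	"strconv"
	"strings"
)

// Hypothetical annotation prefix, not the one ECK uses.
const recreateAnnotationPrefix = "example.org/recreate-"

// StatefulSet is a simplified stand-in for the Kubernetes object.
type StatefulSet struct {
	Name        string
	UID         string
	StorageSize string
}

// Cluster stands in for the Elasticsearch custom resource.
type Cluster struct {
	Annotations map[string]string
}

// Store stands in for the API server holding the StatefulSets.
type Store struct {
	StatefulSets map[string]StatefulSet
	nextUID      int
}

// Create stores a StatefulSet under a fresh UID, as the API server would.
func (s *Store) Create(sset StatefulSet) {
	s.nextUID++
	sset.UID = "uid-" + strconv.Itoa(s.nextUID)
	s.StatefulSets[sset.Name] = sset
}

// RequestRecreation records the intent on the cluster object: the UID of the
// StatefulSet to replace and the storage size the replacement should have.
func RequestRecreation(c *Cluster, current StatefulSet, newSize string) error {
	raw, err := json.Marshal(StatefulSet{Name: current.Name, UID: current.UID, StorageSize: newSize})
	if err != nil {
		return err
	}
	c.Annotations[recreateAnnotationPrefix+current.Name] = string(raw)
	return nil
}

// ReconcileRecreations performs at most one step per pending annotation and
// reports whether another reconciliation is needed. It keeps no state in
// memory, so it is safe to call again after an operator restart.
func ReconcileRecreations(c *Cluster, s *Store) (bool, error) {
	requeue := false
	for key, raw := range c.Annotations {
		if !strings.HasPrefix(key, recreateAnnotationPrefix) {
			continue
		}
		var want StatefulSet
		if err := json.Unmarshal([]byte(raw), &want); err != nil {
			return false, err
		}
		current, exists := s.StatefulSets[want.Name]
		switch {
		case exists && current.UID == want.UID:
			// The old object is still there: delete it (a real operator
			// would orphan the Pods rather than delete them).
			delete(s.StatefulSets, want.Name)
			requeue = true
		case !exists:
			// The old object is gone: create the replacement from the annotation.
			s.Create(StatefulSet{Name: want.Name, StorageSize: want.StorageSize})
			requeue = true
		default:
			// A StatefulSet with a different UID exists: the recreation is done.
			delete(c.Annotations, key)
		}
	}
	return requeue, nil
}
```

In this sketch, calling `ReconcileRecreations` repeatedly walks through delete, create, and annotation cleanup, one step per call. Because the replacement is built from the annotation rather than from the current spec, the StatefulSet still gets recreated if the node set was renamed in the meantime, which gives the leftover Pods an owner again.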

Hope that makes sense!

Fantastic, thank you for the quick and thorough reply -- that makes sense to me!
