Operator Upgrade causes nodes to be restarted


I noticed that upgrading the operator causes all the nodes in the cluster to be restarted. Perhaps it is because it changes the common.k8s.elastic.co/controller-version annotation on the Elasticsearch object.

This might be a source of trouble (downtime, performance degradation, etc.). Is this expected behavior, and can it be avoided?


Hey @acondrat, this is explicitly documented here: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-upgrading-eck.html#k8s-ga-upgrade

We aim to avoid it in the future as much as we can, but any change in ECK impacting the Pod specs will trigger a rolling upgrade, by design.

Note that you can set the following annotation on any existing Elasticsearch resource if you don't want ECK to reconcile it:

common.k8s.elastic.co/pause: true
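For context, here is a rough sketch of where that annotation would sit in a manifest. The resource name `quickstart` and the `apiVersion` are assumptions for illustration and may not match your cluster or ECK version; only the metadata fragment is shown, not a full spec. Kubernetes annotation values have to be strings, so the value is quoted here.

```yaml
# Metadata fragment only; the rest of the Elasticsearch spec is unchanged.
# "quickstart" is a hypothetical name, and the apiVersion is an assumption
# that depends on the installed ECK version.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
  annotations:
    # Annotation values must be strings, hence the quotes.
    common.k8s.elastic.co/pause: "true"
```

Presumably the annotation would need to be in place before the operator is upgraded, and removed again once you are ready for the rolling upgrade to proceed.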

Thanks @sebgl, I missed that line in the docs!

Still, I think the operator should not do anything if the Elasticsearch spec did not change.