ECK 1.0.1
k8s 1.16.9
The ECK operator has been stellar, but I ran into trouble deploying node resource changes and node count changes at the same time.
In one cluster it seems to have worked as expected, but in another the Elasticsearch object is stuck applying changes:
NAME         HEALTH   NODES   VERSION   PHASE             AGE
es-cluster   green    17      7.9.3     ApplyingChanges   321d
The change requested increased the node count from 17 to 23 total and changed resources on the existing 17 nodes.
The new nodes were added successfully, but every reconciliation attempt reported the same failed_predicates entry:
do_not_restart_healthy_node_if_MaxUnavailable_reached
It listed all pre-existing data and master nodes as causes of the failure.
I was using the default update strategy and change budget, so I expected it to add all new nodes immediately and then terminate one existing node at a time. But it didn't attempt to terminate any existing nodes.
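To spell out what I expected, here is a rough sketch of the change budget arithmetic in Python. It is only my simplified model, not the operator's code: the function name `allowed_changes` is made up, and it assumes the defaults are an unbounded maxSurge and a maxUnavailable of 1, with the budget measured against the target node count.

```python
from typing import Optional, Tuple


def allowed_changes(current: int, target: int, healthy: int,
                    max_surge: Optional[int] = None,
                    max_unavailable: int = 1) -> Tuple[int, int]:
    """Return (nodes that may be created now, healthy nodes that may be taken down now).

    Hypothetical model: both limits are measured against the target node count.
    """
    to_add = max(target - current, 0)
    if max_surge is not None:
        # A bounded surge caps how far the node count may exceed the target.
        to_add = min(to_add, max(target + max_surge - current, 0))
    # maxUnavailable caps how far the healthy count may drop below the target.
    min_healthy = target - max_unavailable
    can_remove = max(healthy - min_healthy, 0)
    return to_add, can_remove


# Before the new nodes join: 17 running and healthy, target of 23.
print(allowed_changes(current=17, target=23, healthy=17))  # (6, 0)
# After all 6 new nodes are healthy: one existing node may be restarted.
print(allowed_changes(current=23, target=23, healthy=23))  # (0, 1)
```

Under that model, no existing node can be restarted until the new nodes are healthy, but once all 23 are up I would expect one restart at a time to be allowed.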
And after 29 reconciliation attempts over ~60 seconds, it stopped trying.
Is there a bug or known limitation in making both changes at the same time with that update strategy?
Is there a way to kick-start the watcher again?
I've tried manually restarting nodes, and it has no effect on the Elasticsearch object.
Thanks in advance!