We've been using ECK since late last year. We started with the beta operator, moved to 1.0, and are now on 1.1.2, doing a complete uninstall/reinstall of the operator each time. We have also upgraded the cluster through various versions along the way and are currently on 7.8. About 3-4 times over the past year, usually about 3-4 months into the cluster working fine, it has randomly died in the middle of the night and acted as though it were spinning up a new cluster: it releases its PVCs, recreates all of its secrets, and tries to init a new cluster. That always fails, and the cluster stays in a failed state until I delete it and recreate it. Has anyone else experienced this behavior? I would really like to understand why this is happening, as it makes for some unpleasant late-night calls.
Here are the logs from the operator when this begins to happen: