I'm trying a simple 3-node cluster. All 3 nodes are master/data/ingest.
If I delete all 3 pods manually, the pods spin up again with the same names, but they fail to join the cluster. According to the logs, they're looking for completely different, non-existent master nodes.
Is there a way to resolve this, i.e. to ensure that even after deletion the pods try to rejoin the existing masters?
Where does the operator save information about the masters?
I'm removing pods, and they get re-created. I have PVCs, so the pods should re-use the data from those PVCs, right?
Who keeps the information about what the suffix of the pod name should be?
Step-by-step scenario:
1. The operator is spun up.
2. 3 pods for the ES cluster are running.
3. Kill the 3 pods for the ES cluster (to emulate a cluster failure, for example, or an invasive cluster upgrade).
4. 3 pods are spun up again with the same names, but none of them work: each tries to connect to phantom masters which don't exist anywhere.
Expectation:
Instead of step 4, the pods spin back up, mount the same volumes, and everything is back to normal.
If using ECK version 0.8 (or 0.8.1), you should name your `volumeClaimTemplates` `data`. See the quickstart section about persistent storage. Otherwise these volumes won't be mapped to the actual Elasticsearch data directory, which will use an `emptyDir` volume instead (not persisted after pod deletion).
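For reference, a minimal sketch of what that looks like in a 0.8.x manifest, assuming the `v1alpha1` CRD layout from the quickstart; the cluster name, version, and storage size here are illustrative, the key point is the claim template named `data`:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: quickstart          # illustrative cluster name
spec:
  version: 7.1.0            # illustrative ES version
  nodes:
  - nodeCount: 3
    volumeClaimTemplates:
    - metadata:
        name: data          # must be exactly "data" on ECK 0.8.x
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi    # illustrative size
```

With the template named `data`, each pod gets its own PVC bound to the Elasticsearch data directory, so a deleted pod re-mounts its old cluster state (including who the masters are) when it comes back.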
I guess you may have been misled by the docs for the master branch, in which we changed that volume name from `data` to `elasticsearch-data`. This will only apply starting with ECK 0.9 (not released yet).
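So on the master branch (and in 0.9 once released), the equivalent fragment would presumably be the same apart from the template name, along these lines:

```yaml
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data   # master branch / ECK 0.9+ naming
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```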
Sorry for the confusion; things should work better with the volume claim template renamed to `data`.