However, the cluster now cannot start because of a java.lang.IllegalStateException: alias [mongo] has more than one write index [mongo-2020.01.03-000033,mongo-2019.08.29-000015]
The cluster state dump has two nodes named "es-master-*". I wonder whether you have had a split-brain situation because of this. Is discovery.zen.minimum_master_nodes set correctly to 2 for this setup, or was it ever wrong?
It looks as if one version of the cluster state has the mongo alias with the write index set to mongo-2020.01.03-000033 and the other has it set to mongo-2019.08.29-000015. Judging by the dates in the index names, maybe an old master was resurrected and joined the cluster after the full restart?
It would be good to also see the log file from the other master node, as well as the settings (in particular minimum_master_nodes).
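To illustrate the condition behind the exception, here is a minimal Python sketch that scans alias metadata for aliases flagged as write index on more than one index. It assumes the input is a dict shaped like the response of the get alias API (index name, then "aliases", then alias name, then "is_write_index"). The function name and the sample data are hypothetical; the third index in the sample is invented purely for illustration.

```python
from collections import defaultdict


def find_conflicting_write_aliases(aliases_by_index):
    """Return {alias: [indices]} for aliases with more than one write index.

    Assumes input shaped like a get-alias response:
    {index_name: {"aliases": {alias_name: {"is_write_index": bool}}}}
    """
    write_indices = defaultdict(list)
    for index_name, body in aliases_by_index.items():
        for alias_name, alias_meta in body.get("aliases", {}).items():
            # Only an explicit true counts as a write index here.
            if alias_meta.get("is_write_index") is True:
                write_indices[alias_name].append(index_name)
    return {
        alias: sorted(indices)
        for alias, indices in write_indices.items()
        if len(indices) > 1
    }


# Hypothetical sample mirroring the exception message above.
sample = {
    "mongo-2019.08.29-000015": {"aliases": {"mongo": {"is_write_index": True}}},
    "mongo-2020.01.03-000033": {"aliases": {"mongo": {"is_write_index": True}}},
    # Invented index, included only to show a non-write member of the alias.
    "mongo-2019.12.01-000030": {"aliases": {"mongo": {"is_write_index": False}}},
}

conflicts = find_conflicting_write_aliases(sample)
print(conflicts)
```

This only works on a copy of the alias metadata you already have at hand; it does not talk to the cluster, which in this case is not up anyway.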
Thank you for your answer. discovery.zen.minimum_master_nodes is currently set to 2; however, I do not know whether it was always set to 2 in the past.
Also, since the full restart was done by removing the k8s pods, I don't believe that an old master could have been resurrected, because I can't imagine how - maybe I am wrong though.
I am providing full logs and configurations of the cluster (I've hidden the cluster name, but it's the same everywhere):
GET _cluster/state https://pastebin.com/ktsDxKCy
GET /_nodes https://pastebin.com/utRrzb7U
GET _cluster/settings?include_defaults=true https://pastebin.com/yR3RWGuv
Nodes:
All of the elasticsearch.yml files are the same, but there are different env values (provided in the comments of the config file).
I believe we need to manually fix this to get it running again. We should be able to find the UUID of the offending index by enabling trace logging (either globally or for org.elasticsearch.gateway).