We are using eck-operator to run an Elasticsearch cluster on Kubernetes. We have 3 master-eligible nodes and several worker nodes in the cluster.
When the cluster runs on only 2 physical nodes, we understand that it is not resilient to the failure of one of them (see "Resilience in small clusters" in the Elastic Docs).
Our question: when we replace a failed physical node with a fresh one that has none of the previous data, is there a technical way to recover the cluster and keep the index data? We are currently hitting a split-brain issue in the master election because the physical node we replaced hosted 2 of the 3 master-eligible nodes, so the majority is lost and we have to re-establish the cluster to recover.
There is no automatic way to recover the data if you permanently lose a majority of the master-eligible nodes. It is sometimes possible to reconfigure the cluster by hand, but that is not guaranteed to succeed, can result in data loss, and requires manual use of the elasticsearch-node utility, which is hard (maybe even impossible) to use with k8s.
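To make the majority arithmetic concrete, here is a minimal Python sketch that checks, for a given placement of master-eligible nodes on physical hosts, whether a quorum survives the loss of each host. The node and host names are hypothetical and only mirror the layout described in the question (3 master-eligible nodes on 2 hosts); the sketch models simple majority counting, not Elasticsearch's actual election code.

```python
from collections import Counter


def quorum_size(voting_nodes: int) -> int:
    """Smallest strict majority of the voting master-eligible nodes."""
    return voting_nodes // 2 + 1


def survives_host_loss(placement: dict[str, str]) -> dict[str, bool]:
    """For each host, report whether a master quorum remains if that host is lost.

    `placement` maps a master-eligible node name to the host it runs on.
    """
    per_host = Counter(placement.values())
    total = len(placement)
    needed = quorum_size(total)
    return {host: (total - count) >= needed for host, count in per_host.items()}


# Hypothetical layout: 3 master-eligible nodes spread over 2 physical hosts.
placement = {"master-0": "host-a", "master-1": "host-a", "master-2": "host-b"}
result = survives_host_loss(placement)
print(result)
```

With this layout, losing host-b leaves 2 of 3 master-eligible nodes and the cluster keeps its quorum, while losing host-a leaves only 1 and the majority is gone. Spreading the 3 master-eligible nodes over 3 separate hosts would make every entry true.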
In your scenario you will need to set up a new cluster and restore the data from a recent snapshot taken with the official snapshot API.
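As a rough illustration of that snapshot-and-restore path, the Python sketch below builds the two HTTP requests involved: creating a snapshot on the old cluster and restoring it on the new one. The base URL, the repository name my_repo, the snapshot name snap-1 and the index pattern my-index-* are all hypothetical, and the sketch assumes the snapshot repository has already been registered on both clusters. Authentication and TLS settings, which an ECK-managed cluster will require, are left out.

```python
import json
import urllib.request


def snapshot_request(base_url: str, repo: str, snapshot: str) -> urllib.request.Request:
    """Build the request that creates a snapshot and waits for it to complete."""
    url = f"{base_url}/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return urllib.request.Request(url, method="PUT")


def restore_request(
    base_url: str, repo: str, snapshot: str, indices: str
) -> urllib.request.Request:
    """Build the request that restores the given indices from a snapshot."""
    url = f"{base_url}/_snapshot/{repo}/{snapshot}/_restore"
    # Restore only index data here; cluster-wide state is deliberately excluded.
    body = json.dumps({"indices": indices, "include_global_state": False})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical endpoint and names, for illustration only.
take = snapshot_request("https://localhost:9200", "my_repo", "snap-1")
restore = restore_request("https://localhost:9200", "my_repo", "snap-1", "my-index-*")
print(take.get_method(), take.full_url)
print(restore.get_method(), restore.full_url)
```

Nothing is sent until a request is passed to urllib.request.urlopen. In practice snapshots would be taken on a schedule (for example with snapshot lifecycle management) so that a recent one exists before a failure happens.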