When the leader master node fails, it takes about a minute to elect a new leader, but the follower master node takes minutes to switch to the new leader. Is this expected, or is there any configuration that can optimize it?
After the new leader and the follower had joined, I added the third master node back, and it joined the cluster within seconds.
I found this discussion. It looks like setting masterTerminationFix: true can fix the issue. Is this setting set in elasticsearch.yml? Is it a dynamic setting?
The cluster coordination code was completely replaced in Elasticsearch 7, so it is hard to compare it, or its setup, with older versions. Can you share log files from your nodes showing that timeouts occur?
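Alongside the logs, it may help to record when each node actually sees an elected master. The following is only a rough sketch, not an official tool: it polls the `_cat/master` API on one node and reports the longest stretch during which that node reported no master. It assumes the node is reachable over plain HTTP without authentication, and it treats any failed or empty response as "no master", which is an assumption about how a node behaves mid-election. The function names are made up for illustration.

```python
import json
import time
import urllib.request


def current_master(base_url, timeout=2.0):
    """Return the master node name reported by this node, or None on any failure."""
    url = f"{base_url}/_cat/master?format=json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            rows = json.load(resp)
    except (OSError, ValueError):
        # Assumption: connection errors, HTTP errors and bad JSON all mean "no master seen".
        return None
    return rows[0].get("node") if rows else None


def poll(base_url, duration=300.0, interval=1.0):
    """Sample the reported master once per interval; returns [(timestamp, master), ...]."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), current_master(base_url)))
        time.sleep(interval)
    return samples


def longest_gap(samples):
    """Longest time between the first 'no master' sample and the next sample with a master."""
    longest = 0.0
    gap_start = None
    for ts, master in samples:
        if master is None:
            if gap_start is None:
                gap_start = ts
        elif gap_start is not None:
            longest = max(longest, ts - gap_start)
            gap_start = None
    return longest
```

Running `longest_gap(poll("http://follower-host:9200"))` (hostname hypothetical) against the follower while you stop the leader would give a number to compare with the timeouts in the logs.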