Elasticsearch master node failover

Hello. I have an Elasticsearch cluster (for Graylog) running Elasticsearch-oss 7.10.2.
The cluster has 3 nodes: 2 data/master nodes and 1 dedicated master node.

# curl -XGET 'localhost:9200/_cat/nodes?v&pretty'
ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.207.20.22           53          72   0    0.02    0.01     0.00 imr       *      monlog02p
10.207.20.24           34         100   7    0.70    0.49     0.53 dimr      -      monlog04p
10.207.20.25           30         100   2    0.32    0.38     0.43 dimr      -      monlog05p

All nodes are master-eligible, but when I stop the current master node (monlog02p), the cluster doesn't elect a new master:

{"type": "server", "timestamp": "2022-06-17T10:56:25,382Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "graylog", "node.name": "monlog05p", "message": "master not discovered or elected yet, an election requires a node with id [bol3_--gRTeY40YE75jMEg], have discovered [{monlog05p}{Wks5aVZIQdKhh-UQfN6Kjw}{p_43YmiFTUWLsz2qi9zLcg}{10.207.20.25}{10.207.20.25:9300}{dimr}, {monlog04p}{RWc0y6fuRZOOy92Ufr5PJQ}{gwTE_agiSEGaLsUMZKm7XA}{10.207.20.24}{10.207.20.24:9300}{dimr}] which is not a quorum; discovery will continue using [10.207.20.24:9300, 10.207.20.22:9300] from hosts providers and [{monlog05p}{Wks5aVZIQdKhh-UQfN6Kjw}{p_43YmiFTUWLsz2qi9zLcg}{10.207.20.25}{10.207.20.25:9300}{dimr}, {monlog04p}{RWc0y6fuRZOOy92Ufr5PJQ}{gwTE_agiSEGaLsUMZKm7XA}{10.207.20.24}{10.207.20.24:9300}{dimr}, {monlog02p}{bol3_--gRTeY40YE75jMEg}{vRDVG-R9RVyx5XauH_4v9Q}{10.207.20.22}{10.207.20.22:9300}{imr}] from last-known cluster state; node term 52, last-accepted version 2847 in term 52", "cluster.uuid": "Ctv7jUkOS9q0f4FJhoy3Ow", "node.id": "Wks5aVZIQdKhh-UQfN6Kjw"  }
# curl -X GET "localhost:9200/_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions&pretty"
{
  "metadata" : {
    "cluster_coordination" : {
      "voting_config_exclusions" : [
        {
          "node_id" : "Wks5aVZIQdKhh-UQfN6Kjw",
          "node_name" : "monlog05p"
        },
        {
          "node_id" : "bol3_--gRTeY40YE75jMEg",
          "node_name" : "monlog02p"
        },
        {
          "node_id" : "_absent_",
          "node_name" : "node_name"
        },
        {
          "node_id" : "RWc0y6fuRZOOy92Ufr5PJQ",
          "node_name" : "monlog04p"
        }
      ]
    }
  }
}

What should I do?

You should clear your voting config exclusions. See the docs on voting configuration exclusions, in particular:

Clusters should have no voting configuration exclusions in normal operation.

Also, 7.10 has passed EOL and is no longer supported. You should upgrade to a supported version as a matter of urgency.
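For example, a DELETE against the voting config exclusions API should clear the list (assuming Elasticsearch is reachable on localhost:9200, as in your earlier commands):

curl -X DELETE "localhost:9200/_cluster/voting_config_exclusions"

These exclusions are normally added with a POST to the same endpoint when permanently removing master-eligible nodes, and they are meant to be cleared again once that removal is complete.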

# curl -XDELETE 'localhost:9200/_cluster/voting_config_exclusions'
{"error":{"root_cause":[{"type":"timeout_exception","reason":"timed out waiting for removal of nodes; if nodes should not be removed, set waitForRemoval to false. [{monlog05p}{Wks5aVZIQdKhh-UQfN6Kjw}, {monlog02p}{bol3_--gRTeY40YE75jMEg}, {node_name}{_absent_}, {monlog04p}{RWc0y6fuRZOOy92Ufr5PJQ}]"}],"type":"timeout_exception","reason":"timed out waiting for removal of nodes; if nodes should not be removed, set waitForRemoval to false. [{monlog05p}{Wks5aVZIQdKhh-UQfN6Kjw}, {monlog02p}{bol3_--gRTeY

It doesn't work...
Graylog only supports this version, unfortunately.

I see, there's a small bug in the error message that #87828 fixes. It should read:

if nodes should not be removed, set ?wait_for_removal=false

Set that parameter and it should work.

curl -X DELETE "localhost:9200/_cluster/voting_config_exclusions?wait_for_removal=false"
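Once that succeeds, the exclusions list from your earlier _cluster/state call should come back empty, and stopping the current master should then trigger a normal election:

curl -X GET "localhost:9200/_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions&pretty"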

That worked! Great, thanks.
