Hello everyone,
I've recently having issue moving the master role from node to node.
I got 3 node in my cluster and there are all 3 master-eligible nodes :
I first disable shard allocation on my second node and i let elasticsearch move all the shards across the 2 other node.
I then shutdown elasticsearch on the second node to perform a disk change on the VM and add storage (while changing the mounting point names)
But whenever i shutdown the master node, the role never got transfer to another node and i'm stuck with a cluster without any entry point available.
Logs from the others nodes :
[2020-01-08T11:43:38,012][WARN ][r.suppressed ] [hostname] path: /_bulk, params: {}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.handleBlockExceptions(TransportBulkAction.java:534) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.doRun(TransportBulkAction.java:415) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$2.onTimeout(TransportBulkAction.java:568) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:325) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:252) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:598) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) [elasticsearch-7.5.1.jar:7.5.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
[2020-01-08T11:43:38,020][WARN ][r.suppressed ] [hostname] path: /_bulk, params: {}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.handleBlockExceptions(TransportBulkAction.java:534) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.doRun(TransportBulkAction.java:415) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$2.onTimeout(TransportBulkAction.java:568) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:325) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:252) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:598) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) [elasticsearch-7.5.1.jar:7.5.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
[2020-01-08T11:43:38,141][WARN ][r.suppressed ] [hostname] path: /_bulk, params: {}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.handleBlockExceptions(TransportBulkAction.java:534) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.doRun(TransportBulkAction.java:415) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$2.onTimeout(TransportBulkAction.java:568) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:325) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:252) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:598) [elasticsearch-7.5.1.jar:7.5.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) [elasticsearch-7.5.1.jar:7.5.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
My config file (same on the 3 node except the IP) :
cluster.name : MyCluster
node.name : ${HOSTNAME}
path.data : /data
path.logs : /var/log/elasticsearch
network.host : my_ip
http.port : 9200
discovery.seed_hosts : ["ip_node1","ip_node2","ip_node3"]
discovery.zen.minimum_master_nodes : 1
cluster.max_shards_per_node: 2000
xpack.monitoring.enabled: false
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: cert/path
xpack.security.transport.ssl.truststore.path: cert/path
If i restart my node who was previously master, the master role got switch at the moment the node restart
It's not the first time i'm having this issue and i never get an explanation on how to properly correct this issue or if there is a misconfiguration from my side ...