Cluster High-Availability

fim · May 7, 2019, 8:02am

I think there is a problem in my configuration: the master role does not fail over properly when a node goes down.

We operate a two-node cluster (test environment); later it will be a three-node cluster (production environment). For testing purposes I stopped one node in the two-node cluster. When I query the remaining node with curl or Postman, I receive the following error message:

[user01@host02 ~]$ curl -XGET elastic:elastic@127.0.0.1:9200/_cluster/health?pretty
    {
      "error" : {
        "root_cause" : [
          {
            "type" : "master_not_discovered_exception",
            "reason" : null
          }
        ],
        "type" : "master_not_discovered_exception",
        "reason" : null
      },
      "status" : 503
    }

I also can't explain why Kibana shows the cluster status as "Green".

Could somebody help me with this issue?

warkolm · May 7, 2019, 8:17am

What version are you on? What do your logs show?

fim · May 7, 2019, 8:31am

ES version 7.0.0

[2019-05-07T10:25:25,099][DEBUG][o.e.a.a.c.s.TransportClusterStateAction] [sag-tst-es-002.sag.services] timed out while retrying [cluster:monitor/state] after failure (timeout [30s])
[2019-05-07T10:25:25,099][WARN ][r.suppressed             ] [sag-tst-es-002.sag.services] path: /_cluster/settings, params: {include_defaults=true}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
        at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:259) [elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:555) [elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-7.0.0.jar:7.0.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:835) [?:?]
[2019-05-07T10:25:25,100][DEBUG][o.e.a.a.c.s.TransportClusterStateAction] [sag-tst-es-002.sag.services] no known master node, scheduling a retry
[2019-05-07T10:25:25,105][WARN ][r.suppressed             ] [sag-tst-es-002.sag.services] path: /_monitoring/bulk, params: {system_id=kibana, system_api_version=6, interval=10000ms}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
        at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:191) ~[elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:177) ~[elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.xpack.monitoring.action.TransportMonitoringBulkAction.doExecute(TransportMonitoringBulkAction.java:55) ~[?:?]
        at org.elasticsearch.xpack.monitoring.action.TransportMonitoringBulkAction.doExecute(TransportMonitoringBulkAction.java:35) ~[?:?]
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:145) [elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$apply$0(SecurityActionFilter.java:86) [x-pack-security-7.0.0.jar:7.0.0]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$authorizeRequest$4(SecurityActionFilter.java:171) [x-pack-security-7.0.0.jar:7.0.0]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-7.0.0.jar:7.0.0]
        at org.elasticsearch.xpack.security.authz.AuthorizationService.lambda$authorizeAction$4(AuthorizationService.java:238) [x-pack-security-7.0.0.jar:7.0.0]
        at org.elasticsearch.xpack.security.authz.AuthorizationService$AuthorizationResultListener.onResponse(AuthorizationService.java:604) [x-pack-security-7.0.0.jar:7.0.0]
        at org.elasticsearch.xpack.security.authz.AuthorizationService$AuthorizationResultListener.onResponse(AuthorizationService.java:579) [x-pack-security-7.0.0.jar:7.0.0]
...

DavidTurner · May 7, 2019, 8:33am

A two-node cluster cannot be made highly available, because you always require a majority of master-eligible nodes to be available and a majority of two nodes is two nodes. You need at least three nodes for high availability. This isn't a limitation in Elasticsearch so much as a fundamental property of distributed systems.
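
The majority arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting argument, not Elasticsearch's election code; the function names `quorum` and `tolerated_failures` are made up for this example.

```python
def quorum(master_eligible: int) -> int:
    """Smallest strict majority of the given number of master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)


if __name__ == "__main__":
    # Hypothetical cluster sizes, chosen only to show the pattern.
    for n in (1, 2, 3, 5):
        print(f"{n} nodes: majority = {quorum(n)}, "
              f"tolerated failures = {tolerated_failures(n)}")
```

With two nodes the majority is two, so losing either node leaves no master, which matches the `master_not_discovered_exception` shown earlier. With three nodes the majority is still two, so one node can fail.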

system · June 4, 2019, 8:33am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
