Error: master_not_discovered_exception / ELK 7.17.12 - RHEL8

Hi,

I am getting this error when I execute these commands:

curl -X GET "http://SFAPRL17026.gestion.mrqgest:9200/_license"

curl -X GET "http://SFAPRL17026.gestion.mrqgest:9200/_cluster/health?pretty"

Error text:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

I have an Elasticsearch cluster with 5 nodes: 2 master and 3 data.

The Elasticsearch service is started on all nodes.

Please find below the logs from the master and data nodes:

[2024-06-25T11:12:38,123][ERROR][t.b.r.e.s.EsIndexJsonContentService] [server17026].-10000m
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:179) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.checkGlobalBlock(TransportSingleShardAction.java:112) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.(TransportSingleShardAction.java:146) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.(TransportSingleShardAction.java:130) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:98) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:51) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:186) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.ActionFilter$Simple.apply(ActionFilter.java:53) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:184) ~[elasticsearch-7.17.12.jar:7.17.12]
at tech.beshu.ror.es.handler.AclAwareRequestFilter$EsChain.continue(AclAwareRequestFilter.scala:301) ~[?:?]
at tech.beshu.ror.es.IndexLevelActionFilter.proceed(IndexLevelActionFilter.scala:133) ~[?:?]
at tech.beshu.ror.es.IndexLevelActionFilter.$anonfun$apply$1(IndexLevelActionFilter.scala:120) ~[?:?]
at map @ tech.beshu.ror.es.services.EsIndexJsonContentService.sourceOf(EsIndexJsonContentService.scala:61) ~[?:?]
at flatMap @ tech.beshu.ror.configuration.index.IndexConfigManager.load(IndexConfigManager.scala:36) ~[?:?]
at map @ tech.beshu.ror.boot.engines.MainConfigBasedReloadableEngine.loadRorConfigFromIndex(MainConfigBasedReloadableEngine.scala:130) ~[?:?]
at flatMap @ tech.beshu.ror.es.ReadonlyRestEsConfig$.load(ReadonlyRestEsConfig.scala:35) ~[?:?]
at map @ tech.beshu.ror.boot.RorInstance.$anonfun$tryMainEngineReload$1(RorInstance.scala:193) ~[?:?]
at map @ tech.beshu.ror.boot.RorInstance.$anonfun$tryMainEngineReload$1(RorInstance.scala:194) ~[?:?]
at runSyncUnsafe @ tech.beshu.ror.buildinfo.BuildInfoReader$.$anonfun$create$1(BuildInfoReader.scala:35) ~[?:?]
at fromAutoCloseable @ tech.beshu.ror.buildinfo.BuildInfoReader$.tryWithResources(BuildInfoReader.scala:53) ~[?:?]
at fromAutoCloseable @ tech.beshu.ror.buildinfo.BuildInfoReader$.tryWithResources(BuildInfoReader.scala:53) ~[?:?]
at use @ tech.beshu.ror.buildinfo.BuildInfoReader$.tryWithResources(BuildInfoReader.scala:54) ~[?:?]
at map @ tech.beshu.ror.boot.RorInstance.$anonfun$scheduleEnginesReload$1(RorInstance.scala:148) ~[?:?]
at runAsync @ tech.beshu.ror.boot.RorInstance.$anonfun$scheduleIndexConfigChecking$1(RorInstance.scala:163) ~[?:?]
[2024-06-25T11:12:38,125][ERROR][t.b.r.e.s.EsIndexJsonContentService] [server17026].-10000m
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:179) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.checkGlobalBlock(TransportSingleShardAction.java:112) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.(TransportSingleShardAction.java:146) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.(TransportSingleShardAction.java:130) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:98) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:51) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:186) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.ActionFilter$Simple.apply(ActionFilter.java:53) ~[elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:184) ~[elasticsearch-7.17.12.jar:7.17.12]
at tech.beshu.ror.es.handler.AclAwareRequestFilter$EsChain.continue(AclAwareRequestFilter.scala:301) ~[?:?]
at tech.beshu.ror.es.IndexLevelActionFilter.proceed(IndexLevelActionFilter.scala:133) ~[?:?]
at tech.beshu.ror.es.IndexLevelActionFilter.$anonfun$apply$1(IndexLevelActionFilter.scala:120) ~[?:?]
at map @ tech.beshu.ror.es.services.EsIndexJsonContentService.sourceOf(EsIndexJsonContentService.scala:61) ~[?:?]
at flatMap @ tech.beshu.ror.configuration.index.IndexTestConfigManager.load(IndexTestConfigManager.scala:60) ~[?:?]
at map @ tech.beshu.ror.boot.engines.TestConfigBasedReloadableEngine.loadRorConfigFromIndex(TestConfigBasedReloadableEngine.scala:214) ~[?:?]
at flatMap @ tech.beshu.ror.es.ReadonlyRestEsConfig$.load(ReadonlyRestEsConfig.scala:35) ~[?:?]
at map @ tech.beshu.ror.boot.RorInstance.$anonfun$tryTestEngineReload$1(RorInstance.scala:202) ~[?:?]
at runSyncUnsafe @ tech.beshu.ror.buildinfo.BuildInfoReader$.$anonfun$create$1(BuildInfoReader.scala:35) ~[?:?]
at fromAutoCloseable @ tech.beshu.ror.buildinfo.BuildInfoReader$.tryWithResources(BuildInfoReader.scala:53) ~[?:?]
at fromAutoCloseable @ tech.beshu.ror.buildinfo.BuildInfoReader$.tryWithResources(BuildInfoReader.scala:53) ~[?:?]
at use @ tech.beshu.ror.buildinfo.BuildInfoReader$.tryWithResources(BuildInfoReader.scala:54) ~[?:?]
at map @ tech.beshu.ror.boot.RorInstance.$anonfun$scheduleEnginesReload$1(RorInstance.scala:149) ~[?:?]
at runAsync @ tech.beshu.ror.boot.RorInstance.$anonfun$scheduleIndexConfigChecking$1(RorInstance.scala:163) ~[?:?]
at map @ tech.beshu.ror.es.services.EsIndexJsonContentService.sourceOf(EsIndexJsonContentService.scala:61) ~[?:?]
at flatMap @ tech.beshu.ror.configuration.index.IndexConfigManager.load(IndexConfigManager.scala:36) ~[?:?]
at map @ tech.beshu.ror.boot.engines.MainConfigBasedReloadableEngine.loadRorConfigFromIndex(MainConfigBasedReloadableEngine.scala:130) ~[?:?]
at flatMap @ tech.beshu.ror.es.ReadonlyRestEsConfig$.load(ReadonlyRestEsConfig.scala:35) ~[?:?]
at map @ tech.beshu.ror.boot.RorInstance.$anonfun$tryMainEngineReload$1(RorInstance.scala:193) ~[?:?]
[2024-06-25T11:12:38,631][WARN ][o.e.d.PeerFinder ] [server17026].-10000m
[2024-06-25T11:12:38,632][WARN ][o.e.d.PeerFinder ] [server17026].-10000m
[2024-06-25T11:12:38,632][WARN ][o.e.d.PeerFinder ] [server17026].-10000m
[2024-06-25T11:12:38,855][DEBUG][o.e.a.s.m.TransportMasterNodeAction] [server17026].-10000m
[2024-06-25T11:12:38,855][WARN ][r.suppressed ] [server17026].-10000m
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:297) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.17.12.jar:7.17.12]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
at java.lang.Thread.run(Thread.java:1623) [?:?]
[2024-06-25T11:12:38,857][DEBUG][o.e.a.s.m.TransportMasterNodeAction] [server17026].-10000m
[2024-06-25T11:12:38,857][WARN ][r.suppressed ] [server17026].-10000m
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:297) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.17.12.jar:7.17.12]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
at java.lang.Thread.run(Thread.java:1623) [?:?]
[2024-06-25T11:12:38,857][DEBUG][o.e.a.s.m.TransportMasterNodeAction] [server17026].-10000m
[2024-06-25T11:12:38,857][WARN ][r.suppressed ] [server17026].-10000m
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:297) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.17.12.jar:7.17.12]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.17.12.jar:7.17.12]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
at java.lang.Thread.run(Thread.java:1623) [?:?]

Thanks in advance for your help.

It looks like you are running with a third-party plugin that is not supported here. I have no experience with this plugin, but if it is used to secure communication within the cluster it is possible that it is the cause of communication issues. If you remove or disable this so you have a standard Elasticsearch cluster it may be easier to get help here.

You may also want to consult the creator of the plugin or community around it.

But the ReadonlyREST plugin is compatible with Elasticsearch; it is already installed in our other production and pre-production environments without issue.

Many Thanks

As well as the third-party plugin (which may be "compatible" according to its developers but the ES developers offer no such claim) I think there's something else weird/nonstandard in your setup. This is definitely not a log message that regular Elasticsearch would emit:

In any case, this troubleshooting guide describes the information we'd need to see in order to help you. Also please format your posts better, it's almost impossible to read the logs you shared.
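As a rough illustration of the kind of detail that usually helps with a discovery problem, here is a minimal sketch that could be run on each node. It is not taken from the guide. `ES_URL`, `ES_CONFIG` and the function names are assumptions made for this example, and the config path is only the usual default for RPM installs. It requests the node's root endpoint and its local node settings, then prints the discovery-related keys from `elasticsearch.yml`.

```shell
#!/usr/bin/env bash
# Sketch only: ES_URL and ES_CONFIG are assumptions, adjust them per node.
ES_URL="${ES_URL:-http://localhost:9200}"
ES_CONFIG="${ES_CONFIG:-/etc/elasticsearch/elasticsearch.yml}"  # usual RPM default

# Send a GET request to the node; $1 is the path and query string.
es_get() {
  curl -sS --max-time 10 -X GET "${ES_URL}$1"
}

# Print the uncommented discovery-related keys found in the config file $1.
# Only the key line is printed, so YAML lists written over several lines
# still need to be read from the file itself.
discovery_settings() {
  grep -E '^[[:space:]]*(cluster\.name|node\.name|node\.roles|network\.host|discovery\.seed_hosts|cluster\.initial_master_nodes)[[:space:]]*:' "$1"
}

# Gather everything in one pass so the output can be pasted as a single block.
collect_discovery_info() {
  echo "== node info =="
  es_get "/?pretty"
  echo "== local node settings =="
  es_get "/_nodes/_local/settings?pretty"
  echo "== discovery settings in ${ES_CONFIG} =="
  discovery_settings "${ES_CONFIG}"
}
```

After sourcing the file, something like `ES_URL=http://SFAPRL17026.gestion.mrqgest:9200 collect_discovery_info` would print the output for one node. Sharing it for every master-eligible node, together with the complete `o.e.d.PeerFinder` and `ClusterFormationFailureHelper` log lines, should make it easier to see why no master is elected.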
