SERVICE_UNAVAILABLE/2/no master

Hi, I am new to Elasticsearch and need your help to understad the issue. Here is my elasticsearch.yml

======================== Elasticsearch Configuration =========================

cluster.name: ELKcluster

node.name: node-ELK0-rb

node.attr.rack: r1b
node.master: false
node.data: false
node.ingest: false
cluster.routing.allocation.awareness.force.rack.values: r1a,r1c
cluster.routing.allocation.awareness.attributes: rack

path.data: /var/lib/elasticsearch

path.logs: /var/log/elasticsearch

network.host: 10.126.32.62
transport.host: 10.126.32.62

cluster.initial_master_nodes: ["node-ELK0", "node-ELK3", "node-ELK2"]

#discovery.zen.minimum_master_nodes: 2

xpack.ml.enabled: false
xpack.monitoring.enabled: false
#xpack.reporting.enabled: false
xpack.security.enabled: false
xpack.watcher.enabled: false

After the cluster worked fine for some time, like couple of months, it is now unavailable and here is the error message from the logs:
[2022-10-07T11:12:59,141][WARN ][r.suppressed ] [node-ELK0-rb] path: /_bulk, params: {}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized, SERVICE_UNAVAILABLE/2/no master];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:179) ~[elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.handleBlockExceptions(TransportBulkAction.java:635) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.doRun(TransportBulkAction.java:481) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$2.onTimeout(TransportBulkAction.java:669) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.16.3.jar:7.16.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Suppressed: org.elasticsearch.discovery.MasterNotDiscoveredException
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:297) ~[elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.16.3.jar:7.16.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]

Please format your code/logs/config using the </> button, or markdown style back ticks. It helps to make things easy to read which helps us help you :slight_smile:

How many nodes are in your cluster?

Hi,
I am new to Elasticsearch and need your help to understad the issue.
The cluster is deployed on AWS has 1 client node, 2 master nodes , and additional 2 data nodes

Here is my elasticsearch.yml on the client node:

cluster.name: ELKcluster

node.name: node-ELK0-rb

node.attr.rack: r1b
node.master: false
node.data: false
node.ingest: false
cluster.routing.allocation.awareness.force.rack.values: r1a,r1c
cluster.routing.allocation.awareness.attributes: rack

path.data: /var/lib/elasticsearch

path.logs: /var/log/elasticsearch

network.host: 10.126.32.62
transport.host: 10.126.32.62

cluster.initial_master_nodes: ["node-ELK0", "node-ELK3", "node-ELK2"]

#discovery.zen.minimum_master_nodes: 2

xpack.ml.enabled: false
xpack.monitoring.enabled: false
#xpack.reporting.enabled: false
xpack.security.enabled: false
xpack.watcher.enabled: false

After the cluster worked fine for some time, like couple of months, it is now unavailable and here is the error message from the logs available on the client node:

[2022-10-07T11:12:59,141][WARN ][r.suppressed ] [node-ELK0-rb] path: /_bulk, params: {}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized, SERVICE_UNAVAILABLE/2/no master];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:179) ~[elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.handleBlockExceptions(TransportBulkAction.java:635) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation.doRun(TransportBulkAction.java:481) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$2.onTimeout(TransportBulkAction.java:669) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.16.3.jar:7.16.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Suppressed: org.elasticsearch.discovery.MasterNotDiscoveredException
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:297) ~[elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.16.3.jar:7.16.3]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.16.3.jar:7.16.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]

Do all your nodes have enough disk space?

yes, the disc space should not be a problem.
Is there a way to debug the master discovery, to understand what is really happening?

Can you take a look at the logs from the current master node? Or are all nodes in your cluster having this issue?

Hello,
I restarted the service. It stoped because of an error, I looked at the corresponding logs that is throwing the exception:

[2022-10-11T08:07:14,747][ERROR][o.e.b.Bootstrap          ] [node-ELK2] Exception
org.elasticsearch.ElasticsearchException: failed to bind service
        at org.elasticsearch.node.Node.<init>(Node.java:1090) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) [elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) [elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) [elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) [elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) [elasticsearch-cli-7.16.3.jar:7.16.3]
        at org.elasticsearch.cli.Command.main(Command.java:77) [elasticsearch-cli-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122) [elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) [elasticsearch-7.16.3.jar:7.16.3]
Caused by: java.nio.file.AccessDeniedException: /mnt/data/nodes
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:90) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:398) ~[?:?]
        at java.nio.file.Files.createDirectory(Files.java:700) ~[?:?]
        at java.nio.file.Files.createAndCheckIsDirectory(Files.java:807) ~[?:?]
        at java.nio.file.Files.createDirectories(Files.java:793) ~[?:?]
        at org.elasticsearch.env.NodeEnvironment.lambda$new$0(NodeEnvironment.java:300) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:224) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:298) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.node.Node.<init>(Node.java:427) ~[elasticsearch-7.16.3.jar:7.16.3]
        ... 11 more
[2022-10-11T08:07:14,751][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-ELK2] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/mnt/data/nodes];
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) ~[elasticsearch-cli-7.16.3.jar:7.16.3]
        at org.elasticsearch.cli.Command.main(Command.java:77) ~[elasticsearch-cli-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-7.16.3.jar:7.16.3]
Caused by: org.elasticsearch.ElasticsearchException: failed to bind service
        at org.elasticsearch.node.Node.<init>(Node.java:1090) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-7.16.3.jar:7.16.3]
        ... 6 more
Caused by: java.nio.file.AccessDeniedException: /mnt/data/nodes
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:90) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:398) ~[?:?]
        at java.nio.file.Files.createDirectory(Files.java:700) ~[?:?]
        at java.nio.file.Files.createAndCheckIsDirectory(Files.java:807) ~[?:?]
        at java.nio.file.Files.createDirectories(Files.java:793) ~[?:?]
        at org.elasticsearch.env.NodeEnvironment.lambda$new$0(NodeEnvironment.java:300) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:224) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:298) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.node.Node.<init>(Node.java:427) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) ~[elasticsearch-7.16.3.jar:7.16.3]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-7.16.3.jar:7.16.3]

Actually I checked, while directory /mnt/data exists, nodes folder is not found on this master.

Another master node restarts without an issue, the corresponding folder is present there but it is empty.

The both data nodes has the same issue exlained above.

Is there any comprehensive tutorial available how to set up Elasticsearch cluster that has various types of nodes?

Another question: I am reviewing the elasticsearch.yml files on different nodes. I am wondering if value/list of discovery.zen.ping.unicast.hosts should be the same within the cluster? Our cluster has 1 client, 2 data and 2 master nodes, it is deployed on AWS EC2 corresponding instances.

Finally, when calling elasticsearch status I get

elasticsearch dead but subsys locked

Should I delete the lock file ( elasticsearch) on subsys?

finally, the issue was that elasticsearch user had no access rights to the correct directory. Adding the correct permission resolved the issue

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.