org.elasticsearch.discovery.MasterNotDiscoveredException: null

Everything was working fine until I restarted my cluster this weekend. Since then, the same error keeps coming up no matter what I do.

The issue first appeared in prod and then in UAT. I'm not sure what is going on.

My cluster is made up of two nodes located in the same region.

I have read almost every post related to this error, and everything seems to be configured correctly, yet I still get it.
I even deleted the whole cluster and created it again, but the issue persists.

I did the following:

  1. Checked the configuration to make sure everything is fine
  2. Checked with telnet that the TCP port is accessible from one node to the other
  3. Changed the port number itself to make sure nothing else is using that TCP port

Is there something else I should be doing?
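
The telnet check in step 2 can also be scripted so that it is easy to repeat from each node. This is a minimal sketch, not an official Elasticsearch tool: `can_connect` is a hypothetical helper name, and the host and port in the usage comment are placeholders taken from the dummy values in this thread. Note that a successful TCP connection only shows the port is reachable; it does not prove that discovery is pointed at that port.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (placeholder values, run from the other node):
#   can_connect("host1", 8903)
```

Running it with the server name and again with the IP address would also show whether name resolution, rather than the port itself, is the part that fails.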

What version are you on?
What OS?
How did you install Elasticsearch?
What does your config look like?
What does the entire log entry look like? Post the whole thing, not part of it.

What version are you on? - 5.1.2
What OS? - RHEL 6.8
How did you install Elasticsearch? - Unzipped the zip file
What does your config look like?
What does the entire log entry look like? Post the whole thing, not part of it. - The same error is repeated:

[2017-11-20T00:02:54,840][INFO ][o.e.e.NodeEnvironment ] [-na-uat] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-11-20T00:02:54,841][INFO ][o.e.n.Node ] [-na-uat] node name [-na-uat], node ID [N50Yb36rR8-2aLxHCepXBQ]
[2017-11-20T00:02:54,844][INFO ][o.e.n.Node ] [-na-uat] version[5.1.2], pid[7767], build[c8c4c16/2017-01-11T20:18:39.146Z], OS[Linux/2.6.32-696.6.3.el6.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_74/25.74-b02]
[2017-11-20T00:02:55,813][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [aggs-matrix-stats]
[2017-11-20T00:02:55,813][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [ingest-common]
[2017-11-20T00:02:55,814][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [lang-expression]
[2017-11-20T00:02:55,814][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [lang-groovy]
[2017-11-20T00:02:55,814][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [lang-mustache]
[2017-11-20T00:02:55,814][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [lang-painless]
[2017-11-20T00:02:55,814][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [percolator]
[2017-11-20T00:02:55,814][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [reindex]
[2017-11-20T00:02:55,814][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [transport-netty3]
[2017-11-20T00:02:55,815][INFO ][o.e.p.PluginsService ] [-na-uat] loaded module [transport-netty4]
[2017-11-20T00:02:55,815][INFO ][o.e.p.PluginsService ] [-na-uat] no plugins loaded
[2017-11-20T00:02:58,445][INFO ][o.e.n.Node ] [-na-uat] initialized
[2017-11-20T00:02:58,445][INFO ][o.e.n.Node ] [-na-uat] starting ...
[2017-11-20T00:02:58,620][INFO ][o.e.t.TransportService ] [-na-uat] publish_address {...:8903}, bound_addresses {0.0.0.0:8903}
[2017-11-20T00:02:58,627][INFO ][o.e.b.BootstrapCheck ] [-na-uat] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-11-20T00:03:28,649][WARN ][o.e.n.Node ] [-na-uat] timed out while waiting for initial discovery state - timeout: 30s
[2017-11-20T00:03:28,666][INFO ][o.e.h.HttpServer ] [-na-uat] publish_address {...:8900}, bound_addresses {0.0.0.0:8900}
[2017-11-20T00:03:28,667][INFO ][o.e.n.Node ] [-na-uat] started
[2017-11-20T00:03:28,967][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [-na-uat] no known master node, scheduling a retry
[2017-11-20T00:03:29,133][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [-na-uat] no known master node, scheduling a retry
[2017-11-20T00:03:29,144][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [-na-uat] no known master node, scheduling a retry
[2017-11-20T00:04:28,971][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [-na-uat] timed out while retrying [indices:admin/create] after failure (timeout [1m])
[2017-11-20T00:04:28,974][WARN ][r.suppressed ] path: /xxxx/xxxxx/xxxxx, params: {index=xxxxx-11062017, id=Bond-xxxx, type=xxxx}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:214) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:350) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:240) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.cluster.service.ClusterService$NotifyTimeout.run(ClusterService.java:964) [elasticsearch-5.1.2.jar:5.1.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458) [elasticsearch-5.1.2.jar:5.1.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_74]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_74]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_74]

Please show your config.

Please also format your code using the </> button, or markdown style back ticks. It helps to make things easy to read which helps us help you :slight_smile:

# Use a descriptive name for your cluster:

cluster.name: test

# ------------------------------------ Node ------------------------------------

# Use a descriptive name for the node:

node.name: test-na-master
node.master: true
node.data: true
node.ingest: false

# Add custom attributes to the node:

#node.attr.rack: r1

# Lock the memory on startup:

#bootstrap.memory_lock: true

# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.

# Elasticsearch performs poorly when the system is swapping the memory.

# ---------------------------------- Network -----------------------------------

# Set the bind address to a specific IP (IPv4 or IPv6):

network.host: 0.0.0.0

# Set a custom port for HTTP:

http.port: 8900

# For more information, consult the network module documentation.

#:

# --------------------------------- Discovery ----------------------------------

# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]

transport.tcp.port: 8903
discovery.zen.ping.unicast.hosts: ["host1:8901","host2:9201"]

# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):

discovery.zen.minimum_master_nodes : 2

# For more information, consult the zen discovery module documentation.

# ---------------------------------- Gateway -----------------------------------

# Block initial recovery after a full cluster restart until N nodes are started:

gateway.recover_after_nodes : 2

# For more information, consult the gateway module documentation.

# ---------------------------------- Various -----------------------------------

# Require explicit names when deleting indices:

#action.destructive_requires_name: true
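
One thing worth checking in the posted config: `transport.tcp.port` is 8903 (and the log shows the transport bound to 8903), while the entries in `discovery.zen.ping.unicast.hosts` use ports 8901 and 9201. Each unicast entry has to point at the transport port the target node actually listens on. The sketch below flags entries whose port differs from a given transport port. It rests on assumptions that the thread does not confirm: that both nodes use the same `transport.tcp.port`, and that the posted ports are real rather than dummy values like the host names. `mismatched_unicast_ports` is a hypothetical helper, and it only handles plain `host` or `host:port` entries (no IPv6 literals or port ranges).

```python
def mismatched_unicast_ports(unicast_hosts, transport_port):
    """Return (host, port) pairs whose explicit port differs from transport_port.

    Assumption: every node in the cluster listens on the same transport port.
    Entries without an explicit port are skipped, not flagged.
    """
    mismatches = []
    for entry in unicast_hosts:
        host, sep, port_text = entry.rpartition(":")
        if not sep:
            # No ":" in the entry, so no explicit port was given.
            continue
        port = int(port_text)
        if port != transport_port:
            mismatches.append((host, port))
    return mismatches


# Values copied from the config posted in this thread.
posted_hosts = ["host1:8901", "host2:9201"]
print(mismatched_unicast_ports(posted_hosts, 8903))
```

If the other node really does listen on a different transport port, a mismatch here is expected and harmless; the point is only that the port in each entry should be verified against the `publish_address` line in the target node's own log.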

Everything was working until I made one change: instead of giving the IP addresses of the hosts, I switched to server names and restarted. It has been failing since last night.

Now it does not work with IP addresses either. Sorry, I gave dummy host names and IPs, as we can't share the real ones publicly.

I think the nodes can connect over TCP, but there is somehow a problem with master election. What would cause master election to fail?
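
On the election question, the arithmetic behind `discovery.zen.minimum_master_nodes` is worth spelling out. The config comment gives the formula (total number of master-eligible nodes / 2 + 1), and with two master-eligible nodes that works out to 2, so neither node can elect a master until it has discovered the other. The sketch below only illustrates that rule; the function names are hypothetical and it is a simplification, not how Zen discovery is implemented.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum from the config comment: master-eligible nodes / 2 + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(visible_master_eligible: int, minimum_setting: int) -> bool:
    """Simplified rule: an election needs at least `minimum_setting`
    master-eligible nodes that can see each other (counting this node)."""
    return visible_master_eligible >= minimum_setting


quorum = minimum_master_nodes(2)    # two-node cluster from this thread
print(quorum)                       # 2
print(can_elect_master(1, quorum))  # False: a node that sees only itself
print(can_elect_master(2, quorum))  # True: both nodes found each other
```

This is consistent with the "timed out while waiting for initial discovery state" warning in the log: if a node never discovers its peer, the quorum of 2 is never reached and every request that needs a master ends in `MasterNotDiscoveredException`.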

Just a quick one, are you deploying ES in AWS by any chance?
