Hi team,
I have a 3-node Elasticsearch cluster. At one point it stopped functioning properly because the master node left (reason = shut_down). This is the log from node IDNDCI-VSCSPSM1:
</>
[2023-09-13T01:14:28,394][INFO ][o.e.n.Node ] [IDNDCI-VSCSPSM1] started
[2023-09-13T01:20:09,510][INFO ][o.e.d.z.ZenDiscovery ] [IDNDCI-VSCSPSM1] master_left [{IDNDCI-VSCSPGR5}{0IS_JsPsTkK7wxlKZmYcGA}{6h50Ysw_QlS6xiXp_7W1ag}{IDNDCI-VSCSPGR5}{10.162.40.47:9300}], reason [shut_down]
[2023-09-13T01:20:09,510][WARN ][o.e.d.z.ZenDiscovery ] [IDNDCI-VSCSPSM1] master left (reason = shut_down), current nodes: nodes:
{IDNDCI-VSCSPGR5}{0IS_JsPsTkK7wxlKZmYcGA}{6h50Ysw_QlS6xiXp_7W1ag}{IDNDCI-VSCSPGR5}{10.162.40.47:9300}, master
{IDNDCI-VSCSPSM1}{Qi8DZwLOSCCGb_hCYK0ayQ}{eU-NYiPlRhKM8ezq6fxEYQ}{IDNDCI-VSCSPSM1}{10.162.40.18:9300}, local
{IDNDCI-VSCSPGR7}{Ut_237WpSnO9mpaB_ACDbw}{Vbu7dLhwTuWhX0nPajW1OQ}{IDNDCI-VSCSPGR7}{10.162.40.49:9300}
[2023-09-13T01:20:09,525][WARN ][o.e.t.n.Netty4Transport ] [IDNDCI-VSCSPSM1] write and flush on the network layer failed (channel: [id: 0x745ba491, L:0.0.0.0/0.0.0.0:9300 ! R:/10.162.40.47:62087])
java.nio.channels.ClosedChannelException: null
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2023-09-13T01:20:10,544][WARN ][o.e.c.NodeConnectionsService] [IDNDCI-VSCSPSM1] failed to connect to node {IDNDCI-VSCSPGR5}{0IS_JsPsTkK7wxlKZmYcGA}{6h50Ysw_QlS6xiXp_7W1ag}{IDNDCI-VSCSPGR5}{10.162.40.47:9300} (tried [1] times)
org.elasticsearch.transport.ConnectTransportException: [IDNDCI-VSCSPGR5][10.162.40.47:9300] connect_timeout[30s]
at org.elasticsearch.transport.netty4.Netty4Transport.connectToChannels(Netty4Transport.java:363) ~[?:?]
at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:570) ~[elasticsearch-5.6.16.jar:5.6.16]
at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:473) ~[elasticsearch-5.6.16.jar:5.6.16]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:342) ~[elasticsearch-5.6.16.jar:5.6.16]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:329) ~[elasticsearch-5.6.16.jar:5.6.16]
at org.elasticsearch.cluster.NodeConnectionsService.validateAndConnectIfNeeded(NodeConnectionsService.java:154) [elasticsearch-5.6.16.jar:5.6.16]
at org.elasticsearch.cluster.NodeConnectionsService$1.doRun(NodeConnectionsService.java:107) [elasticsearch-5.6.16.jar:5.6.16]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:675) [elasticsearch-5.6.16.jar:5.6.16]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.6.16.jar:5.6.16]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.8.0_271]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.8.0_271]
at java.lang.Thread.run(Unknown Source) [?:1.8.0_271]
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: IDNDCI-VSCSPGR5/10.162.40.47:9300
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source) ~[?:?]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:352) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:632) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
... 1 more
Caused by: java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source) ~[?:?]
at
</>
This is the elasticsearch.yml config:
</>
bootstrap.memory_lock: true
cluster.name: prd_sm_es_cluster
discovery.zen.ping.unicast.hosts:
- IDNDCI-VSCSPSM1
- IDNDCI-VSCSPGR5
- IDNDCI-VSCSPGR7
http.port: 9200
</>
After I changed elasticsearch.yml like this:
</>
bootstrap.memory_lock: true
cluster.name: prd_sm_es_cluster
discovery.zen.ping.unicast.hosts:
- IDNDCI-VSCSPSM1:9300
- IDNDCI-VSCSPGR5:9300
- IDNDCI-VSCSPGR7:9300
http.port: 9200
</>
the nodes could communicate with the master node again.
I need to find the root cause of this issue.
- Port 9300 is open.
- All 3 hosts are in the same network segment.
- No firewall is blocking the port.
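For reference, here is a minimal sketch of how the port check could be repeated from each node. It assumes Python 3 is available on the hosts; the helper name `check_port` and the 3-second timeout are only illustrative. The hostnames and port 9300 are the ones from the config and logs above. A "connection refused" result (as in the stack trace) generally means the host was reachable but nothing was listening on the port at that moment, while a timeout points more towards a network or firewall problem.

```python
import socket

# Hostnames from discovery.zen.ping.unicast.hosts; 9300 is the transport port seen in the logs.
HOSTS = ["IDNDCI-VSCSPSM1", "IDNDCI-VSCSPGR5", "IDNDCI-VSCSPGR7"]
TRANSPORT_PORT = 9300


def check_port(host, port, timeout=3.0):
    """Try a plain TCP connection; return (True, "") on success, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        # Covers refused connections, timeouts and name-resolution failures.
        return False, str(exc)


if __name__ == "__main__":
    for host in HOSTS:
        ok, reason = check_port(host, TRANSPORT_PORT)
        status = "reachable" if ok else "FAILED: " + reason
        print(f"{host}:{TRANSPORT_PORT} -> {status}")
```

Running this on each of the three nodes, both while the cluster is healthy and right after a `master_left` event, would show whether the hostname resolves and whether the transport port accepts connections in each direction.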
Thanks and regards,
Adira