Node stuck in cluster after it crashed

nik9000 · December 17, 2013, 2:22am

One of my nodes crashed today and we weren't able to start the machine again.
Sounds like a hardware problem. Anyway, it is still listed in
_cluster/state and shards are trying to relocate to it. Bouncing another
node didn't remove the first node from the list.
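
For anyone checking the same symptom, here is a minimal sketch of how to see what the master still believes. It assumes the HTTP API is reachable at localhost:9200 and that the _cluster/state response has a top-level "nodes" map plus a "routing_nodes.nodes" map whose shard entries carry "state", "relocating_node", "index" and "shard" fields; verify that shape on your version. The function names are hypothetical helpers, not part of any client library.

```python
import json
from urllib.request import urlopen


def fetch_cluster_state(base_url="http://localhost:9200"):
    """Fetch the full cluster state as a dict (base_url is an assumption)."""
    with urlopen(base_url + "/_cluster/state") as resp:
        return json.load(resp)


def list_nodes(state):
    """Return (name, node_id, transport_address) for every node in the state."""
    nodes = state.get("nodes", {})
    return sorted(
        (info.get("name") or "", node_id, info.get("transport_address") or "")
        for node_id, info in nodes.items()
    )


def shards_relocating_to(state, node_id):
    """Return (index, shard) pairs currently relocating to the given node id.

    Assumes a relocating shard is reported with state == "RELOCATING" and
    relocating_node set to the target node id.
    """
    hits = []
    routing = state.get("routing_nodes", {}).get("nodes", {})
    for shards in routing.values():
        for shard in shards:
            if (shard.get("state") == "RELOCATING"
                    and shard.get("relocating_node") == node_id):
                hits.append((shard.get("index"), shard.get("shard")))
    return sorted(hits)


# Usage against a live cluster (not run here):
#   state = fetch_cluster_state()
#   for name, node_id, addr in list_nodes(state):
#       print(name, node_id, addr, shards_relocating_to(state, node_id))
```

If the crashed machine still shows up in the node list and has shards relocating to it, the master has not yet noticed that the node is gone.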

Is there some way to force the master to check on the down machine? I'm
constantly getting this exception, which I assume is because that node is
down:
exception caught on transport layer [[id: 0x32f61132]], closing connection
java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:150)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Nik

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAPmjWd1L-Lh-Tb2n%3DYxOK64i2trw9L4ewHzHLHyGOveLDb04yQ%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.

nik9000 · December 17, 2013, 6:53pm

On Mon, Dec 16, 2013 at 9:22 PM, Nikolas Everett nik9000@gmail.com wrote:

Is there some way to force the master to check on the down machine?

I found a way: restart the master and let another master take over.
Blunt, but effective.
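
A rough sketch of how to find which node to restart and to confirm afterwards that the dead node is gone, assuming the _cluster/state response exposes the elected master's id under "master_node" and node details under "nodes". The helper names and the localhost:9200 address are illustrative assumptions, not anything from a client library.

```python
import json
from urllib.request import urlopen


def fetch_cluster_state(base_url="http://localhost:9200"):
    """Fetch the full cluster state as a dict (base_url is an assumption)."""
    with urlopen(base_url + "/_cluster/state") as resp:
        return json.load(resp)


def current_master(state):
    """Return (node_id, name) of the elected master according to the state."""
    master_id = state.get("master_node")
    info = state.get("nodes", {}).get(master_id, {})
    return master_id, info.get("name")


def node_still_listed(state, node_name):
    """True if a node with this name is still present in the cluster state."""
    return any(
        info.get("name") == node_name
        for info in state.get("nodes", {}).values()
    )


# Usage against a live cluster (not run here):
#   before = fetch_cluster_state()
#   print("restart this node:", current_master(before))
#   ... restart the master, wait for a new election ...
#   after = fetch_cluster_state()
#   print("dead node still listed:", node_still_listed(after, "name-of-dead-node"))
```

Comparing the master before and after the restart also confirms that a different node actually took over.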

Nik
