Node in cluster went into a bad state - V 2.2.0

Pradeep_Gowda · October 10, 2016, 9:41pm

Hi All,
Recently one of the nodes in our cluster went into a bad state that I have never seen before. There are three nodes in the cluster, and it had been healthy for a month. I run all operations, such as creating and deleting indices, against a single node. Suddenly that node got into a bad state with the following behavior.

Index creation through the client worked fine, but deleting an index failed with IndexNotFoundException even though the index was actually present.
Creating and deleting aliases through the client also failed, but through curl I was able to create and delete the alias.

I see tons of the exception below in the Elasticsearch log:

 Caused by: [indexname][[indexname][3]] CreateFailedEngineException[Create failed for [indexname#AVUnhw1tQK5-eQ90nvsB]]; nested: NoSuchFileException[/opt/data/elasticsearch/cluster/nodes/0/indices/indexname/3/index/write.lock];
at org.elasticsearch.index.engine.InternalEngine.create(InternalEngine.java:367)
at org.elasticsearch.index.shard.IndexShard.create(IndexShard.java:515)
at org.elasticsearch.index.engine.Engine$Create.execute(Engine.java:810)
at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnReplica(TransportIndexAction.java:195)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnReplica(TransportShardBulkAction.java:436)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnReplica(TransportShardBulkAction.java:68)
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncReplicaAction.doRun(TransportReplicationAction.java:365)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicaOperationTransportHandler.messageReceived(TransportReplicationAction.java:270)
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicaOperationTransportHandler.messageReceived(TransportReplicationAction.java:267)
at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Caused by: java.nio.file.NoSuchFileException: /opt/data/elasticsearch/cluster/nodes/0/indices/indexname/3/index/write.lock

[2016-06-20 12:10:42,159][WARN ][gateway ] [Node0] [indexName][0]: failed to list shard for shard_store on node [Hs_tcXRfR9OVMPNzCIV53w]
FailedNodeException[Failed node [Hs_tcXRfR9OVMPNzCIV53w]]; nested: RemoteTransportException[[Node3][10.13.96.29:9300][internal:cluster/nodes/indices/shard/store[n]]]; nested: IllegalStateException[[indexName][0] index UUID in shard state was: hcPBOHlYTw2cLuQLgoCbqQ expected: 8foEM_sNQcaiOLf0hhByrQ on shard path: /opt/data/elasticsearch/cluster/nodes/0/indices/indexName/0];
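
To list which shards report this UUID mismatch, something like the following Python sketch could scan the log. The function name `find_uuid_mismatches` and the regular expression are my own assumptions, based only on the single warning line above. They are not part of Elasticsearch, and other log lines may be formatted differently.

```python
import re

# Matches the IllegalStateException text from the gateway warning above.
# The pattern is an assumption derived from that one sample line.
UUID_MISMATCH = re.compile(
    r"\[(?P<index>[^\[\]]+)\]\[(?P<shard>\d+)\] index UUID in shard state was: "
    r"(?P<found>\S+) expected: (?P<expected>\S+) on shard path: (?P<path>[^\]]+)\]"
)


def find_uuid_mismatches(lines):
    """Return one dict per log line that reports an index UUID mismatch."""
    results = []
    for line in lines:
        match = UUID_MISMATCH.search(line)
        if match:
            results.append(match.groupdict())
    return results
```

Each returned dict has the keys `index`, `shard`, `found`, `expected` and `path`, so the shard directories holding a stale index UUID can be collected from a whole log file with `find_uuid_mismatches(open(logfile))`, where `logfile` is whatever path the node logs to.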

The same create and delete operations work fine through curl.

Please let me know if you need more information.
