Recover shard failed


(Ming Zhang) #1

Last night my cluster suffered a blackout and all the nodes went down. The cluster consists of 1 master node and 2 data nodes, and the number of replicas is 1. Today I tried to recover the cluster, but exceptions are thrown. This is what I see in the log of the master node:

[2017-10-19T14:50:50,159][WARN ][o.e.g.GatewayAllocator$InternalReplicaShardAllocator] [node-0] [133][0]: failed to list shard for shard_store on node [5zNKiEsyQZqL5gKXTT15ZQ]
org.elasticsearch.action.FailedNodeException: Failed node [5zNKiEsyQZqL5gKXTT15ZQ]
        at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.onFailure(TransportNodesAction.java:239) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.access$200(TransportNodesAction.java:153) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction$1.handleException(TransportNodesAction.java:211) ~[elasticsearch-5.6.3.jar:5.6.3]
        ...
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.13.Final.jar:4.1.13.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: org.elasticsearch.transport.RemoteTransportException: [node-2][192.168.1.231:9300][internal:cluster/nodes/indices/shard/store[n]]
Caused by: org.elasticsearch.ElasticsearchException: Failed to list store metadata for shard [[133][0]]
        at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:114) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.nodeOperation(TransportNodesListShardStoreMetaData.java:64) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.action.support.nodes.TransportNodesAction.nodeOperation(TransportNodesAction.java:140) ~[elasticsearch-5.6.3.jar:5.6.3]
        ...
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.3.jar:5.6.3]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
        ... 1 more
Caused by: org.apache.lucene.index.CorruptIndexException: file mismatch, expected suffix=5o, got=5q (resource=BufferedChecksumIndexInput(SimpleFSIndexInput(path="/home/es/elasticsearch-5.6.3/data/nodes/0/indices/ZKMGS386S8WKxMDBmyBRLA/0/index/segments_5o")))
        at org.apache.lucene.codecs.CodecUtil.checkIndexHeaderSuffix(CodecUtil.java:363) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:307) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:448) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:445) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:692) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:644) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:450) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
        at org.elasticsearch.common.lucene.Lucene.readSegmentInfos(Lucene.java:130) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.index.store.Store.readSegmentsInfo(Store.java:198) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.index.store.Store.access$200(Store.java:126) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.index.store.Store$MetadataSnapshot.loadMetadata(Store.java:817) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:750) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.index.store.Store.readMetadataSnapshot(Store.java:416) ~[elasticsearch-5.6.3.jar:5.6.3]
        ...
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1539) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.6.3.jar:5.6.3]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.3.jar:5.6.3]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
        ... 1 more
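
To make the `CorruptIndexException` line easier to read: as far as I understand, the suffix of a Lucene `segments_N` file is the commit generation written in base 36, so `5o` and `5q` can be decoded into plain numbers. The snippet below is only a minimal sketch under that assumption; `decode_suffixes` is a hypothetical helper name, not part of Elasticsearch or Lucene, and the message string is copied from the log above.

```python
import re

# Text copied from the CorruptIndexException line in the log above.
MESSAGE = "file mismatch, expected suffix=5o, got=5q"


def decode_suffixes(message):
    """Return (expected, got) as integers.

    Assumption: both suffixes are base-36 encoded commit generations.
    """
    match = re.search(r"expected suffix=(\w+), got=(\w+)", message)
    if match is None:
        raise ValueError("no suffix pair found in message")
    expected, got = match.groups()
    return int(expected, 36), int(got, 36)


expected_gen, got_gen = decode_suffixes(MESSAGE)
print(expected_gen, got_gen)  # 204 206
```

If that reading is right, the file named `segments_5o` (generation 204) carries a header that identifies it as generation 206, which is what the "file mismatch" message is complaining about.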

Can I get the data back?

Thanks!

