Elasticsearch 5.4.3 cluster can't go green: X-Pack monitoring Shard Activity shows error "Phase[1] phase1 failed"

zqc0512 · September 1, 2017, 1:59pm

In Elasticsearch 5.4.3, the X-Pack monitoring Shard Activity view shows an error:
Files and Bytes are zero, and Translog is n/a.
As a result the cluster stays yellow and takes a very long time to go green. Recovery is very slow: after about two hours the cluster still has not gone green.
What can I do?

zqc0512 · September 1, 2017, 3:10pm

In the log:

Caused by: org.elasticsearch.transport.RemoteTransportException: [xxx.xxx][xxx.xxxx:9302][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: Phase[1] phase1 failed
        at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:140) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-5.4.3.jar:5.4.3]
        ... 5 more
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: Failed to transfer [1] files with total size of [1.2gb]
        at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:337) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:138) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-5.4.3.jar:5.4.3]
        ... 5 more
Caused by: org.elasticsearch.transport.ReceiveTimeoutTransportException: [datanode][dataip:9303][internal:index/shard/recovery/filesInfo] request_id [118881] timed out after [900000ms]
        at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:934) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) ~[elasticsearch-5.4.3.jar:5.4.3]
        ... 3 more

zqc0512 · September 2, 2017, 8:56am

Can someone help me?

zqc0512 · September 2, 2017, 2:39pm

update

zqc0512 · September 3, 2017, 12:27pm

up!
thanks.

zqc0512 · September 4, 2017, 12:37am

update.
thanks.

zqc0512 · September 5, 2017, 12:36am

update

jasontedor · September 5, 2017, 1:42am

Please stop bumping.

From our community guidelines:

Be patient. This mostly applies to forums, mailing lists, and code contributions (i.e. asynchronous forms of communication). Communities are often built on volunteer time both from participants and organizers. It is possible that your question or code contribution or suggestion might not receive an immediate response. Be patient and consider the norms of the community. One reminder ping is welcome, many reminder pings in rapid succession are not a good display of patience. Similarly, posting the same question in multiple threads is frowned upon and should not be done.

Note that you posted going into a weekend, and a holiday weekend in the US no less.

zqc0512 · September 11, 2017, 12:38am

update

zqc0512 · September 12, 2017, 2:32am

I reduced the values of the following settings:
"cluster.routing.allocation.node_initial_primaries_recoveries":"100",
"cluster.routing.allocation.node_concurrent_recoveries": "100",
Now there is no error.
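
For readers who want to make the same change, here is a minimal sketch of lowering the two settings through the cluster settings API (`PUT /_cluster/settings`). The helper names, the `http://localhost:9200` address, and the values 2 and 4 are assumptions for illustration only; the thread does not say which values were finally used. A cluster with X-Pack security enabled would also need credentials, which this sketch leaves out.

```python
import json
import urllib.request


def build_recovery_settings(concurrent_recoveries, initial_primaries_recoveries,
                            persistent=False):
    """Build the request body for PUT /_cluster/settings.

    Both helper and parameter names are hypothetical; only the two
    setting keys come from the thread.
    """
    scope = "persistent" if persistent else "transient"
    return {
        scope: {
            "cluster.routing.allocation.node_concurrent_recoveries":
                concurrent_recoveries,
            "cluster.routing.allocation.node_initial_primaries_recoveries":
                initial_primaries_recoveries,
        }
    }


def apply_settings(base_url, body):
    """Send the settings to the cluster and return the parsed JSON reply."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Example values only (assumption): far lower than the 100 quoted above.
    body = build_recovery_settings(2, 4)
    print(json.dumps(body, indent=2))
    # Uncomment to apply against a real node (address is an assumption):
    # print(apply_settings("http://localhost:9200", body))
```

Transient settings are lost on a full cluster restart; passing `persistent=True` would keep them. Fewer concurrent recoveries per node means each file transfer competes with fewer others, which may be why the 15-minute (900000ms) `filesInfo` timeout in the log stopped occurring.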

zqc0512 · September 25, 2017, 2:58am

update

system · October 23, 2017, 2:59am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.