Elasticsearch 5.4.3 cluster can't go green: X-Pack monitoring Shard Activity shows error "Phase[1] phase1 failed"


(andy_zhou) #1

In Elasticsearch 5.4.3, the X-Pack monitoring Shard Activity view shows an error:
Files and Bytes are zero, and Translog is n/a.
As a result, the cluster is yellow and stays that way for a long time. Recovery is very slow: after about two hours it still has not gone green.
What can I do?


(andy_zhou) #2

In the log:

Caused by: org.elasticsearch.transport.RemoteTransportException: [xxx.xxx][xxx.xxxx:9302][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: Phase[1] phase1 failed
        at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:140) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-5.4.3.jar:5.4.3]
        ... 5 more
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: Failed to transfer [1] files with total size of [1.2gb]
        at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:337) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:138) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-5.4.3.jar:5.4.3]
        ... 5 more
Caused by: org.elasticsearch.transport.ReceiveTimeoutTransportException: [datanode][dataip:9303][internal:index/shard/recovery/filesInfo] request_id [118881] timed out after [900000ms]
        at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:934) ~[elasticsearch-5.4.3.jar:5.4.3]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) ~[elasticsearch-5.4.3.jar:5.4.3]
        ... 3 more
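
Reading the trace above, the innermost "Caused by" line is the one that matters: the filesInfo request to the target node timed out. A minimal sketch for pulling that line out of a pasted trace and converting the timeout to minutes; the helper names are hypothetical and the parsing assumes the log format shown here.

```python
import re


def root_cause(trace: str) -> str:
    """Return the text of the last 'Caused by:' line, i.e. the innermost cause."""
    causes = [
        line.strip()[len("Caused by:"):].strip()
        for line in trace.splitlines()
        if line.strip().startswith("Caused by:")
    ]
    return causes[-1] if causes else ""


def timeout_minutes(cause: str):
    """Extract 'timed out after [Nms]' from a cause line and convert it to minutes."""
    match = re.search(r"timed out after \[(\d+)ms\]", cause)
    return int(match.group(1)) / 60000 if match else None
```

Applied to the trace in this post, the timeout of 900000 ms works out to 15 minutes per attempt, which fits with the recovery appearing stuck for hours.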

(andy_zhou) #3

Can someone help me?


(andy_zhou) #4

update


(andy_zhou) #5

up!
thanks.


(andy_zhou) #6

update.
thanks.


(andy_zhou) #7

update


(Jason Tedor) #8

Please stop bumping.

From our community guidelines:

Be patient. This mostly applies to forums, mailing lists, and code contributions (i.e. asynchronous forms of communication). Communities are often built on volunteer time both from participants and organizers. It is possible that your question or code contribution or suggestion might not receive an immediate response. Be patient and consider the norms of the community. One reminder ping is welcome, many reminder pings in rapid succession are not a good display of patience. Similarly, posting the same question in multiple threads is frowned upon and should not be done.

Note that you posted going into a weekend, and a holiday weekend in the US no less.


(andy_zhou) #9

update


(andy_zhou) #10

I changed these settings to smaller values:
"cluster.routing.allocation.node_initial_primaries_recoveries":"100",
"cluster.routing.allocation.node_concurrent_recoveries": "100",

Now there is no error.
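
For anyone hitting the same problem, here is a minimal sketch of lowering these two settings through the cluster update settings API (`PUT /_cluster/settings`). The host URL and the values 2 and 4 are assumptions (I believe they are the 5.x defaults); the exact values used in this thread are not stated. The function only builds the request, so nothing is sent until you call `urlopen` yourself.

```python
import json
import urllib.request


def build_recovery_settings_request(base_url, concurrent=2, initial_primaries=4):
    """Build a PUT /_cluster/settings request that lowers recovery concurrency.

    base_url, concurrent and initial_primaries are illustrative assumptions,
    not values taken from the thread.
    """
    body = {
        "transient": {
            "cluster.routing.allocation.node_concurrent_recoveries": concurrent,
            "cluster.routing.allocation.node_initial_primaries_recoveries": initial_primaries,
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# To actually apply it (assumes a node reachable at this address):
# with urllib.request.urlopen(build_recovery_settings_request("http://localhost:9200")) as resp:
#     print(resp.read().decode("utf-8"))
```

Using `transient` means the change is lost on a full cluster restart; use `persistent` instead if it should survive one.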


(andy_zhou) #11

update


(system) #12

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.