Team,
We have a two-node ES cluster that has been upgraded to 5.6.10, and we are having an issue with shards when upgrading from ES 5.6.10 to ES 6.3. We tried both approaches, rolling upgrade and full cluster restart, and neither succeeded.
[2018-06-20T06:06:00,590][WARN ][o.e.c.r.a.AllocationService] [host01.xxx] failing shard [failed shard, shard [my-201708][2], node[s-PbDSILSmKRYlTO4_FOvw], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=aqQb_CrCS-WmyRcrSFtF6w], unassigned_info[[reason=ALLOCATION_FAILED], at[2018-06-20T06:05:59.714Z], failed_attempts[4], delayed=false, details[failed shard on node [s-PbDSILSmKRYlTO4_FOvw]: failed recovery, failure RecoveryFailedException[[my-201708][2]: Recovery failed from {host01.xxx}{wsV326J5QxGdk31A7XY38w}{dhXjygn0Sg-tcKICCSRkiw}{host01.xxx}{150.0.1.242:9300}{ml.machine_memory=32898998272, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {host02.xxx}{s-PbDSILSmKRYlTO4_FOvw}{VAJjb_PvTbe03nBZE_HBKA}{host02.xxx}{150.0.1.149:9300}{ml.machine_memory=32898998272, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[host01.xxx][150.0.1.242:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[host02.xxx][150.0.1.149:9300][internal:index/shard/recovery/prepare_translog]]; nested: IllegalStateException[commit doesn't contain history uuid]; ], allocation_status[no_attempt]], message [failed recovery], failure [RecoveryFailedException[[my-201708][2]: Recovery failed from {host01.xxx}{wsV326J5QxGdk31A7XY38w}{dhXjygn0Sg-tcKICCSRkiw}{host01.xxx}{150.0.1.242:9300}{ml.machine_memory=32898998272, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {host02.xxx}{s-PbDSILSmKRYlTO4_FOvw}{VAJjb_PvTbe03nBZE_HBKA}{host02.xxx}{150.0.1.149:9300}{ml.machine_memory=32898998272, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[host01.xxx][150.0.1.242:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: 
RemoteTransportException[[host02.xxx][150.0.1.149:9300][internal:index/shard/recovery/prepare_translog]]; nested: IllegalStateException[commit doesn't contain history uuid]; ], markAsStale [true]]
org.elasticsearch.indices.recovery.RecoveryFailedException: [my-201708][2]: Recovery failed from {host01.xxx}{wsV326J5QxGdk31A7XY38w}{dhXjygn0Sg-tcKICCSRkiw}{host01.xxx}{150.0.1.242:9300}{ml.machine_memory=32898998272, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {host02.xxx}{s-PbDSILSmKRYlTO4_FOvw}{VAJjb_PvTbe03nBZE_HBKA}{host02.xxx}{150.0.1.149:9300}{ml.machine_memory=32898998272, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:282) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.access$900(PeerRecoveryTargetService.java:80) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:623) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: org.elasticsearch.transport.RemoteTransportException: [host01.xxx][150.0.1.242:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: Phase[1] prepare target for translog failed
at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:191) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:98) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$000(PeerRecoverySourceService.java:50) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:107) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:104) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:246) ~[?:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:304) ~[?:?]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.3.0.jar:6.3.0]
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1592) ~[elasticsearch-6.3.0.jar:6.3.0]
... 5 more
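To make the failure easier to read, here is a minimal Python sketch that pulls the `nested:` exception chain out of the WARN line above. It is not part of Elasticsearch: the function name `nested_causes` and the regular expression are our own assumptions about the log format, and the sample string is only the `nested:` portion of the message copied from the log. On this log, the last entry in the chain is the IllegalStateException raised on host02 during prepare_translog.

```python
import re

# Hypothetical helper (not an Elasticsearch API): splits the "nested:" chain
# of an Elasticsearch log message into (exception name, message) pairs.
# Assumption: each cause is written as  nested: Name[message];
NESTED_CAUSE = re.compile(r"nested: (\w+)\[(.*?)\]; ")


def nested_causes(log_message):
    """Return a list of (exception_name, message) tuples, outermost first."""
    return NESTED_CAUSE.findall(log_message)


# The "nested:" portion of the WARN message, copied from the log above.
sample = (
    "nested: RemoteTransportException[[host01.xxx][150.0.1.242:9300]"
    "[internal:index/shard/recovery/start_recovery]]; "
    "nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; "
    "nested: RemoteTransportException[[host02.xxx][150.0.1.149:9300]"
    "[internal:index/shard/recovery/prepare_translog]]; "
    "nested: IllegalStateException[commit doesn't contain history uuid]; ]"
)

causes = nested_causes(sample)
root_name, root_message = causes[-1]  # innermost cause is listed last

if __name__ == "__main__":
    for name, message in causes:
        print(f"{name}: {message}")
```

Read this way, the recovery of replica shard [my-201708][2] from host01 to host02 fails at the point where host02 prepares its translog, with the message "commit doesn't contain history uuid".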
We would like to upgrade to ES 6 to resolve CVEs in Lucene. Please advise.
Thanks,
Suresh Vytla