Hi,
INITIAL PROBLEM:
It seems that the shards never manage to finish their initialization.
I keep getting the following exception in the logs:
00:00:02.100 WARN org.elasticsearch.indices.cluster - [00848fc0-8093-4f0f-975f-256b500cb423]
[[search-cms-repo][0]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [search-cms-repo][0]: Recovery failed from [160ab28e-2944-4800-b90e-f794b3099c5e][SQlcAKUfQ3G4Y6DcEOTWiw][f755503c733d][inet[/ip1:9300]]{local=false, master=true} into [00848fc0-8093-4f0f-975f-256b500cb423][a1ewguLMRiKrpisESiNpjQ][5f74740899fc][inet[/ip2:9300]]{local=false, master=true}
at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:274)
at org.elasticsearch.indices.recovery.RecoveryTarget.access$700(RecoveryTarget.java:69)
at org.elasticsearch.indices.recovery.RecoveryTarget$RecoveryRunner.doRun(RecoveryTarget.java:550)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.transport.ReceiveTimeoutTransportException: [160ab28e-2944-4800-b90e-f794b3099c5e][inet[/ip1:9300]][internal:index/shard/recovery/start_recovery] request_id [32348290] timed out after [5000ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529)
... 3 common frames omitted
When I look at the size of a problematic shard on disk, I can see that it keeps growing and then drops back to its initial size, every 5 seconds.
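For reference, this is roughly how I watched the shard size on disk. The data path below is an assumption based on the default ES 1.x directory layout (`path.data`/cluster name/node ordinal/index/shard); adjust it to your setup:

```shell
# Re-run du every second to see the shard grow and then reset.
# Path is hypothetical: <path.data>/<cluster_name>/nodes/0/indices/<index>/<shard>
watch -n 1 du -sh /var/lib/elasticsearch/mycluster/nodes/0/indices/search-cms-repo/0
```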
REPRODUCED LOCALLY:
I am not sure how this happened on the real server nodes (ES 1.5.2).
But I managed to reproduce a similar situation locally, consistently. The ES versions tried are 1.5.2 and 1.7.6, and both have the same issue (even though the exception message is slightly different).
- To reproduce locally I simply have a cluster of 2 nodes (ES 1.7.6).
- I reduced the recovery speed (indices.recovery.max_bytes_per_sec) to make sure that the replication of a shard takes more than 5 seconds.
- I have 8 indices (I first tried with 4 and could never reproduce, maybe because of the way the shards are spread, but 8 always seems to do the trick).
- To force the cluster to replicate the shards, I lower the number of replicas from 1 to 0, and then (once the replicas have been removed properly) back to 1.
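Concretely, the repro steps above boil down to the following calls. The index name and throttle value are from my local test; adjust them as needed (ES 1.x syntax, no Content-Type header required):

```shell
# 1. Throttle recovery so copying a shard takes well over 5 seconds
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": { "indices.recovery.max_bytes_per_sec": "100kb" }
}'

# 2. Drop the replicas to 0 ...
curl -XPUT 'localhost:9200/storage-cms-repo/_settings' -d '{
  "index": { "number_of_replicas": 0 }
}'

# 3. ... and, once the cluster has settled, back to 1 to force replication
curl -XPUT 'localhost:9200/storage-cms-repo/_settings' -d '{
  "index": { "number_of_replicas": 1 }
}'
```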
All shards manage to reach the state "STARTED" except one.
It keeps flipping between the states "INITIALIZING" and "UNASSIGNED".
In the logs I keep getting:
2019-03-01 14:35:34,907 WARN org.elasticsearch.indices.cluster - [node1] [[storage-cms-repo][0]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [storage-cms-repo][0]: Recovery failed from [node2][J5iywdFIQ-ia4y5hZiXpBw][glenns-mbp.lab.enonic.net][inet[/127.0.0.1:9301]]{local=false, master=true} into [node1][kyynt3ovTEOk2WM7PsEMSw][glenns-mbp.lab.enonic.net][inet[/127.0.0.1:9300]]{local=false, master=true}
at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:280)
at org.elasticsearch.indices.recovery.RecoveryTarget.access$700(RecoveryTarget.java:70)
at org.elasticsearch.indices.recovery.RecoveryTarget$RecoveryRunner.doRun(RecoveryTarget.java:567)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.transport.RemoteTransportException: [node2][inet[/127.0.0.1:9301]][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.transport.NotSerializableTransportException: [org.elasticsearch.indices.recovery.DelayRecoveryException] source node does not have the shard listed in its state as allocated on the node;
Here are the _cat/shards outputs for each step.
Initial situation:
After setting the replicas to 0:
After setting the replicas back to 1:
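For the record, the tables above come from the cat shards API (host/port are from my local two-node setup):

```shell
# List every shard with its state (STARTED / INITIALIZING / UNASSIGNED),
# size on disk, and the node it is allocated to; ?v adds the header row.
curl -s 'localhost:9200/_cat/shards?v'
```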
As you can see, all shards manage to reach the state "STARTED" except one.
It keeps flipping between the states "INITIALIZING" and "UNASSIGNED".
In the logs I keep getting the warning shown above.
When I look at the size of the problematic shard on disk, I see the same behavior: it keeps growing and then drops back to its initial size.
If I restart the problematic node, the initialization finishes.