Elasticsearch is not starting after the restart


(mramaprasad) #1

We have 3 nodes. The nodes are not coming back up after a restart. I get the following errors:

nested: RecoverFilesRecoveryException[Failed to transfer [0] files with total size of [0b]]; nested: IllegalStateException[try to recover [csv3][1] from primary shard with sync id but number of docs differ: 471350 (node01, primary) vs 471351(node2)]; ]

Caused by: RemoteTransportException[[node2][172.16.168.93:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [0] files with total size of [0b]]; nested: IllegalStateException[try to recover [csv3][1] from primary shard with sync id but number of docs differ: 471350 (node01, primary) vs 471351(node2)];
Caused by: [csv3][[csv3][1]] RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [0] files with total size of [0b]]; nested: IllegalStateException[try to recover [csv3][1] from primary shard with sync id but number of docs differ: 471350 (node01, primary) vs 471351(node2)];
at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:135)

Any idea how to fix this, please?
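
For anyone triaging the same error across several shards, here is a minimal Python sketch that pulls the index, shard number, node names and document counts out of the "number of docs differ" message. The regular expression is written against the message as it appears in this post only; the function name parse_docs_differ and the doc_delta field are made up for illustration and are not part of Elasticsearch.

```python
import re

# Pattern modelled on the message quoted in this thread (an assumption:
# other Elasticsearch versions may word the message differently).
DOCS_DIFFER = re.compile(
    r"try to recover \[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\] from primary shard "
    r"with sync id but number of docs differ: "
    r"(?P<primary_docs>\d+) \((?P<primary_node>[^,]+), primary\) vs "
    r"(?P<replica_docs>\d+)\s*\((?P<replica_node>[^)]+)\)"
)


def parse_docs_differ(line):
    """Return a dict describing the mismatch, or None if the line does not match."""
    match = DOCS_DIFFER.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("shard", "primary_docs", "replica_docs"):
        info[key] = int(info[key])
    # Positive delta: the recovering copy holds more docs than the primary.
    info["doc_delta"] = info["replica_docs"] - info["primary_docs"]
    return info


SAMPLE = (
    "IllegalStateException[try to recover [csv3][1] from primary shard with sync id "
    "but number of docs differ: 471350 (node01, primary) vs 471351(node2)];"
)
```

Calling parse_docs_differ(SAMPLE) reports shard 1 of csv3, with node2 holding one more document than the primary on node01.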


(Abhijitdeka11) #2

Can you check /_cluster/health?pretty? Are there any unassigned_shards?


(mramaprasad) #3

Yes.

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'

{
  "cluster_name" : "ixxx",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 3,
  "active_shards" : 5,
  "relocating_shards" : 0,
  "initializing_shards" : 2,
  "unassigned_shards" : 2,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 55.55555555555556
}
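
A quick way to read these numbers: the reported percentage is consistent with active shards divided by all shard copies (active + initializing + unassigned). The Python sketch below reproduces that arithmetic. The formula is inferred from the values in this output, not taken from the Elasticsearch source, and the helper names are hypothetical.

```python
def total_shard_copies(health):
    """All shard copies the cluster is trying to hold (assumed formula)."""
    return (
        health["active_shards"]
        + health["initializing_shards"]
        + health["unassigned_shards"]
    )


def active_percent(health):
    """Share of shard copies that are active, as a percentage."""
    total = total_shard_copies(health)
    return 100.0 * health["active_shards"] / total if total else 100.0


def missing_copies(health):
    """Copies that are not yet serving: still initializing or unassigned."""
    return health["initializing_shards"] + health["unassigned_shards"]


# Values copied from the health output above.
HEALTH = {
    "active_primary_shards": 3,
    "active_shards": 5,
    "initializing_shards": 2,
    "unassigned_shards": 2,
}
```

With these values there are 9 shard copies in total (3 primaries, so presumably 2 replicas each), 4 of which are not active yet. All primaries are active, which is why the status is yellow rather than red.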


(mramaprasad) #4

After a while I see 0 unassigned shards, but 3 initializing shards:
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 3,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 3,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 66.66666666666666
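
Rather than re-running the health call by hand, the health API can be asked to block until a target status is reached, using its wait_for_status and timeout parameters. Below is a minimal Python sketch of that. The base URL, the 30s timeout and the function names are assumptions chosen for illustration.

```python
import json
import urllib.parse
import urllib.request


def health_url(base, wait_for_status=None, timeout=None):
    """Build a _cluster/health URL, optionally asking the server to wait."""
    params = {}
    if wait_for_status is not None:
        params["wait_for_status"] = wait_for_status
    if timeout is not None:
        params["timeout"] = timeout
    url = base.rstrip("/") + "/_cluster/health"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def fetch_health(base, wait_for_status=None, timeout=None):
    """Call the health API and return the parsed JSON body.

    Requires a reachable cluster; nothing here runs at import time.
    """
    with urllib.request.urlopen(health_url(base, wait_for_status, timeout)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def still_recovering(health):
    """True while any shard copy is initializing or unassigned."""
    return health["initializing_shards"] > 0 or health["unassigned_shards"] > 0
```

For example, fetch_health("http://localhost:9200", "green", "30s") returns after at most 30 seconds. If "timed_out" is true in the response and still_recovering() keeps returning True across repeated calls, the shards are likely stuck rather than slowly recovering.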


(Abhijitdeka11) #5

If the shards are stuck initializing, then I guess the only thing you can try is restarting the nodes.
Please see this blog post:
https://t37.net/how-to-fix-your-elasticsearch-cluster-stuck-in-initializing-shards-mode.html
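
One workaround often suggested for the "number of docs differ" recovery failure, which is not confirmed anywhere in this thread, is to drop the replicas of the affected index and then add them back so they are rebuilt from the primary. The Python sketch below only builds the two update-settings requests. The base URL, the index name csv3 (taken from the error message), the replica count of 2 (guessed from 9 shard copies over 3 primaries) and the function name are all assumptions. Note that this discards the existing replica copies, including any document that exists only on a replica.

```python
import json
import urllib.request


def replica_settings_request(base, index, replicas):
    """Build a PUT /<index>/_settings request that sets number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        base.rstrip("/") + "/" + index + "/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Step 1: remove the replicas. Step 2: restore them so they resync
# from the primary. Both values are assumptions for this cluster.
DROP = replica_settings_request("http://localhost:9200", "csv3", 0)
RESTORE = replica_settings_request("http://localhost:9200", "csv3", 2)
```

To apply a step, pass the request to urllib.request.urlopen() and wait for the cluster to settle before sending the next one. Take a snapshot first if the data matters.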

To avoid this situation this is the permanent solution

