We are currently running version 2.4.0 in our production cluster. After node restarts I've noticed we have around 36 shards which seem to be stuck initializing.
We have been having problems with nodes crashing due to OOM errors. Over time (and many node restarts later -- that's a separate issue I'm trying to address) we have ended up with a bunch of initializing shards that never seem to finish initializing. I'll update this with more info/logs when they become available.
I was wondering if there is anything I can do to address this issue other than deleting the index or restarting the node?
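One thing I've been considering (but haven't tried yet) is manually forcing allocation with the cluster reroute API. A rough sketch of what I mean is below; the node name is a placeholder for one of my actual nodes, and I understand allow_primary is risky:

```shell
# Sketch: ask the master to force-allocate the stuck primary onto a node.
# "node-1" is a placeholder, not a real node name from my cluster.
# WARNING: allow_primary can create an empty primary and lose the shard's
# data if no valid copy exists on that node.
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
  "commands": [
    {
      "allocate": {
        "index": "xyz-index",
        "shard": 0,
        "node": "node-1",
        "allow_primary": true
      }
    }
  ]
}'
```

I'd love to know whether this is a reasonable step here, or whether it will just lose the 57 docs in the shard.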
At this point the cluster is red. In this state, can this have a slowing impact on overall performance? Could this slow down indexing because the cluster is also busy trying to initialize shards?
Is there a way to determine if the shard has become corrupted and just won't ever initialize?
Running the _cat/recovery API tells me the following:
index shard time type stage source_host target_host repository snapshot files files_percent bytes bytes_percent total_files total_bytes translog translog_percent total_translog
xyz-index 0 596067535 store init 10.10.10.10 10.10.10.10 n/a n/a 0 0.0% 0 0.0% 0 0 0 -1.0% -1
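To cross-check from another angle, I can also pull per-shard cluster health, where the stuck shard should show up as initializing; e.g.:

```shell
# Show cluster health broken down to the shard level for this index.
curl -XGET 'localhost:9200/_cluster/health/xyz-index?level=shards&pretty'
```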
Before this shard got stuck, I had verified it contained 57 docs and was only about 23KB, so I'd expect it to initialize pretty quickly.
Any thoughts are greatly appreciated!