Why does a restart perform a recovery that takes a long time (6-12 hrs)?

eightnoteight · December 26, 2018, 11:21am

Hi,

Our cluster consists of 4k indices and about 20k shards spread over 7 nodes. Upon restarting an Elasticsearch node, most of the indices (99.98%) become available within 5 minutes. But every time, a few indices larger than 100 GB get stuck in recovery for hours, sometimes up to 12 hours.

About 100 indices are read/write indices and the rest are read-only indices.
The write indices have 1 replica, but the read-only indices have 0 replicas.

What happens during a restart?

Christian_Dahlqvist · December 26, 2018, 12:41pm

I suspect that at least part of the reason is that you have far too many shards for a cluster of that size. Please read this blog post on shards and sharding practices, and then work to reduce the shard count dramatically, e.g. by reindexing into fewer and larger indices.
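To put numbers on "far too many shards", here is a rough back-of-the-envelope sketch in Python. The shard and node counts come from the question above. The 30 GB heap is a hypothetical value (the thread does not state it), and the "about 20 shards per GB of heap" figure is an assumption taken from Elastic's general sharding guidance of that era, not something stated in this thread.

```python
def shards_per_node(total_shards: int, nodes: int) -> float:
    """Average number of shards each node has to manage."""
    return total_shards / nodes


def shard_budget(heap_gb: float, shards_per_gb_heap: int = 20) -> int:
    """Rule-of-thumb shard ceiling per node (assumed guideline, not a hard limit)."""
    return int(heap_gb * shards_per_gb_heap)


if __name__ == "__main__":
    per_node = shards_per_node(20_000, 7)  # figures from the question
    budget = shard_budget(30)              # hypothetical 30 GB heap per node
    print(f"{per_node:.0f} shards per node vs. a budget of about {budget}")
```

Under those assumptions each node carries several times more shards than the rule of thumb suggests, which is why reducing the shard count is the first recommendation.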

nik9000 · December 26, 2018, 2:50pm

When nodes restart, their shard copies are subtracted from the count of copies. The cluster will wait a while for the node to come back, and, when it does, the node will try to use the copy of the shard that it has on disk as a real shard copy. To do that it needs to make sure its shard copy has everything that happened while it was gone. It can do this by:

1. Syncing all of the files that hold the index from the primary to itself, reusing the files it already has on disk when they match. This isn't as terrible as it sounds, but it can be quite slow if there have been many writes since the last time it happened. This is almost certainly what is happening with your big indices. This was Elasticsearch's original recovery mechanism and is still used today when other mechanisms fail.
2. Relying on a "synced flush" that marks the state of an index, promising that the files on disk contain all of the operations that the primary has. This flush is applied automatically when an index hasn't been written to for a while, but it can also be applied manually. It should make indices that are effectively read-only recover almost instantly. But if there are any writes to the shard while the node is down, Elasticsearch has to fall back to copying files.
3. Replaying the changes that happened while the node was away. This should be possible, but I've not been paying much attention to how the implementation is going. It would allow faster recoveries even with changes, so long as not too many changes happened while the node was gone. You evidently aren't getting that, because your recovery is slow.
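The first two mechanisms described above can be sketched as a small decision function. This is a simplified model of the description in this reply, not Elasticsearch's actual recovery code: `plan_recovery`, the sync-marker arguments, and the file-to-checksum dictionaries are all hypothetical names introduced for illustration.

```python
from typing import Dict, List, Optional


def plan_recovery(
    primary_sync_id: Optional[str],
    replica_sync_id: Optional[str],
    primary_files: Dict[str, str],
    replica_files: Dict[str, str],
) -> List[str]:
    """Return the names of the files that must be copied from the primary.

    An empty list means the local copy can be used as it is.
    """
    # Synced-flush path: identical markers promise identical contents,
    # so nothing needs to be copied.
    if primary_sync_id is not None and primary_sync_id == replica_sync_id:
        return []
    # File-based path: reuse local files whose checksum matches the primary
    # and copy every file that is missing or different.
    return sorted(
        name
        for name, checksum in primary_files.items()
        if replica_files.get(name) != checksum
    )


if __name__ == "__main__":
    primary = {"_0.cfs": "aa11", "_1.cfs": "bb22"}
    local = {"_0.cfs": "aa11"}
    print(plan_recovery("m1", "m1", primary, local))  # marker matches
    print(plan_recovery(None, "m1", primary, local))  # marker gone after writes
```

In this model, a write on the primary while the node is down invalidates the marker, and the more files the primary has rewritten in the meantime, the fewer local files match and the more has to be copied. For a shard over 100 GB that copy is what takes hours.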

Have a look at the instructions for doing a rolling restart update. For the most part they should apply if you only want to restart a single node.
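As a rough sketch of what those instructions boil down to for a single node, the Python below only builds the REST requests to send before and after the restart, and does not contact a cluster. The endpoints are the cluster settings API, the synced flush API, and the cat recovery API. The helper names `restart_requests` and `as_console` are hypothetical, and the synced flush endpoint is specific to Elasticsearch releases from around the time of this thread (later major versions removed it), so check the documentation for your version.

```python
import json
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[dict]]


def restart_requests() -> Tuple[List[Request], List[Request]]:
    """Requests to send before stopping the node and after it rejoins."""
    before: List[Request] = [
        # Stop the cluster from reallocating the node's shards while it is down.
        ("PUT", "/_cluster/settings",
         {"persistent": {"cluster.routing.allocation.enable": "none"}}),
        # Write sync markers so idle shards can skip file copying on recovery.
        ("POST", "/_flush/synced", None),
    ]
    after: List[Request] = [
        # A null value resets the setting to its default (allocation enabled).
        ("PUT", "/_cluster/settings",
         {"persistent": {"cluster.routing.allocation.enable": None}}),
        # Watch the recoveries that are still running.
        ("GET", "/_cat/recovery?active_only=true&v", None),
    ]
    return before, after


def as_console(request: Request) -> str:
    """Render a request in the method/path/body style used by the docs."""
    method, path, body = request
    line = f"{method} {path}"
    return line if body is None else f"{line}\n{json.dumps(body)}"


if __name__ == "__main__":
    before, after = restart_requests()
    for request in before + after:
        print(as_console(request), end="\n\n")
```

Stopping indexing before the synced flush, where that is practical, gives the markers the best chance of still being valid when the node comes back.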

system · January 23, 2019, 2:50pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
