How to avoid/lighten shard recovery after restart?


(R. Toma) #1

Hi group,

Restarting an ES cluster triggers shard recovery, which takes a long time
and puts a heavy load on the cluster. I am searching for a way to reduce
the runtime and load of a restart. I read that someone runs daily rolling
restarts of his large ES cluster to ensure the primary and replica shards
are 100% identical, meaning they can be recovered quickly. But that sounds
like a hack and not something you should be happy with as an SRE. And its
impact on ES performance may be acceptable on a large cluster, but not on
our 3-node cluster.

How I believe shard recovery works: if ES spots differences between a
primary and its replica shard(s), it rebuilds the replica shard(s) as
an exact copy of the primary shard. Rebuilding results in lots of network
traffic and disk I/O.

We have a 3-node ES 1.0.1 cluster with 3k primary shards and 3k replica
shards. During a recent restart (to reduce the heap size to 31G to get
CompressedOops back) the recovery of the 1st node took the longest (~6
hours). Recovery of the 2nd took less (~2 hours) and the 3rd was quick (<1
hour). I believe recovery becomes faster after each node, because each
recovery ends with more replica shards as exact copies of their primary.

I tried force-merging with an expensive max_num_segments=1, but the metrics
segments.count + segments.memory of the same shards still differ between
pri + rep. No luck. For the curious few I have included the before + after
results below.
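
For reference, a minimal sketch of how such a merge can be requested. It assumes an ES 1.x node, where the merge API is the `_optimize` endpoint (later releases renamed it `_forcemerge`). The base URL is a placeholder and the helper name `optimize_request` is hypothetical. The request is only built here, not sent.

```python
import urllib.parse
import urllib.request


def optimize_request(base_url, index, max_num_segments=1):
    """Build (but do not send) a POST /<index>/_optimize request.

    Assumption: ES 1.x, where the merge endpoint is called _optimize.
    """
    query = urllib.parse.urlencode({"max_num_segments": max_num_segments})
    url = "{}/{}/_optimize?{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(index, safe=""),
        query,
    )
    return urllib.request.Request(url=url, method="POST")


# Placeholder host; the index name is the one shown in the results below.
request = optimize_request("http://localhost:9200",
                           "logstash-pro-oracle-2014.04.24")
print(request.get_method(), request.full_url)
# To actually run the merge: urllib.request.urlopen(request)
```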

Any ideas?

Regards,
Renzo

BEFORE:
idx                            shard prirep docs store  segments.count segments.memory
logstash-pro-oracle-2014.04.24 0     p      1072 485592 8              14615
logstash-pro-oracle-2014.04.24 0     r      1072 449022 1              11958
logstash-pro-oracle-2014.04.24 1     p      1095 493774 7              14336
logstash-pro-oracle-2014.04.24 1     r      1095 459966 1              11988
logstash-pro-oracle-2014.04.24 2     p      1039 452078 5              13158
logstash-pro-oracle-2014.04.24 2     r      1039 458513 6              13480
logstash-pro-oracle-2014.04.24 3     p      1094 492753 8              14574
logstash-pro-oracle-2014.04.24 3     r      1094 483347 6              13850
logstash-pro-oracle-2014.04.24 4     p      1099 494740 8              14645
logstash-pro-oracle-2014.04.24 4     r      1099 488953 7              14251

AFTER:
idx                            shard prirep docs store  segments.count segments.memory
logstash-pro-oracle-2014.04.24 0     p      1072 449358 1              11958
logstash-pro-oracle-2014.04.24 0     r      1072 448884 1              11958
logstash-pro-oracle-2014.04.24 1     p      1095 460391 1              11980
logstash-pro-oracle-2014.04.24 1     r      1095 459918 1              11988 <-- rep is 8 bigger than its pri
logstash-pro-oracle-2014.04.24 2     p      1039 431341 1              11580
logstash-pro-oracle-2014.04.24 2     r      1039 431695 1              11572 <-- rep is 8 smaller than its pri
logstash-pro-oracle-2014.04.24 3     p      1094 457135 1              11907
logstash-pro-oracle-2014.04.24 3     r      1094 457970 1              11907
logstash-pro-oracle-2014.04.24 4     p      1099 457640 1              11957
logstash-pro-oracle-2014.04.24 4     r      1099 457165 1              11957
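
The pri/rep comparison above was done by eye; it could be automated roughly as follows. This is a sketch under assumptions: the rows are whitespace-separated with the seven columns shown above, each shard has exactly one replica, and the helper names `parse_rows` and `mismatched_shards` are hypothetical. The sample data is a subset of the AFTER rows.

```python
from collections import defaultdict

COLUMNS = ["idx", "shard", "prirep", "docs", "store",
           "segments.count", "segments.memory"]

# Shards 0-2 from the AFTER listing above.
SAMPLE = """
logstash-pro-oracle-2014.04.24 0 p 1072 449358 1 11958
logstash-pro-oracle-2014.04.24 0 r 1072 448884 1 11958
logstash-pro-oracle-2014.04.24 1 p 1095 460391 1 11980
logstash-pro-oracle-2014.04.24 1 r 1095 459918 1 11988
logstash-pro-oracle-2014.04.24 2 p 1039 431341 1 11580
logstash-pro-oracle-2014.04.24 2 r 1039 431695 1 11572
"""


def parse_rows(text):
    """Parse data rows; skip the header, blank lines and annotated lines."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) or not fields[1].isdigit():
            continue
        row = dict(zip(COLUMNS, fields))
        for name in ("shard", "docs", "store",
                     "segments.count", "segments.memory"):
            row[name] = int(row[name])
        rows.append(row)
    return rows


def mismatched_shards(rows, column):
    """Return {(idx, shard): {"p": value, "r": value}} where pri and rep differ.

    Assumes one replica per shard; with more replicas the last one wins.
    """
    by_shard = defaultdict(dict)
    for row in rows:
        by_shard[(row["idx"], row["shard"])][row["prirep"]] = row[column]
    return {key: copies for key, copies in by_shard.items()
            if len(set(copies.values())) > 1}


for (idx, shard), copies in mismatched_shards(
        parse_rows(SAMPLE), "segments.memory").items():
    print(idx, shard, "pri:", copies["p"], "rep:", copies["r"])
```

Passing `"store"` instead of `"segments.memory"` reports every shard in the sample, since the on-disk sizes differ for all of them.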

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/20ca0eb4-1f62-4465-a289-2ecd740c9c2e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Binh Ly-2) #2

Although you cannot completely eliminate these recovery comparisons at the
moment, there are some things you can do today that may help. This
presentation covers some settings that control when ES starts recovery
after a full restart:

http://www.elasticsearch.org/webinars/elasticsearch-pre-flight-checklist/
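
The settings in question are presumably the `gateway.*` options in elasticsearch.yml (`gateway.recover_after_nodes`, `gateway.expected_nodes`, `gateway.recover_after_time`), which exist in ES 1.x. A rough sketch of values for a small cluster follows. The majority rule and the 5m wait are illustrative assumptions, not recommendations taken from the presentation, and the helper names are hypothetical.

```python
def gateway_settings(node_count, wait="5m"):
    """Suggest gateway.* values for elasticsearch.yml (illustrative only).

    Assumption: start recovery once a majority of nodes has joined, and
    wait up to `wait` for the remaining nodes before recovering anyway.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "gateway.recover_after_nodes": node_count // 2 + 1,
        "gateway.expected_nodes": node_count,
        "gateway.recover_after_time": wait,
    }


def render_yaml(settings):
    """Render the flat settings as 'key: value' lines for elasticsearch.yml."""
    return "\n".join("{}: {}".format(key, value)
                     for key, value in settings.items())


# The 3-node cluster from the question.
print(render_yaml(gateway_settings(3)))
```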

It is also possible to disable allocation before shutdown and then
re-enable it once you are fully back up.
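
A minimal sketch of that toggle, assuming ES 1.0 or later, where the cluster settings API accepts `cluster.routing.allocation.enable`. The base URL is a placeholder and the helper name `allocation_request` is hypothetical. The requests are built but not sent here.

```python
import json
import urllib.request

VALID_MODES = ("all", "primaries", "new_primaries", "none")


def allocation_request(base_url, mode, scope="transient"):
    """Build a PUT /_cluster/settings request that sets allocation to `mode`.

    Transient settings do not survive a full cluster restart; use
    scope="persistent" if the setting must outlive one.
    """
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of {}".format(VALID_MODES))
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    body = {scope: {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Before shutting a node down:
disable = allocation_request("http://localhost:9200", "none")
# Once the node is back and has rejoined the cluster:
enable = allocation_request("http://localhost:9200", "all")

print(disable.get_method(), disable.full_url, disable.data.decode("utf-8"))
# To apply: urllib.request.urlopen(disable), and later urlopen(enable)
```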

Looking ahead, work is currently under way to make this recovery
process more efficient (see the short description about Sequence
Numbers):


