Slow Merge operation blocking node recovery on startup

Ankush_Jhalani · September 18, 2013, 3:29pm

We are using v0.90.1 on a 4-node cluster, with 1 data and 1 client ES
instance running on each node. We have an index ~80 GB in size, with 5
shards, each with 3 replicas. Less than 1% of the data changes every day.
All the merge settings are default.

What we are noticing is that when we bring down ES and start it again, it
can take up to an hour to completely start this index and go from status
'yellow' -> 'green'. I turned on debug tracing and noticed that on starting
the node, each shard is merging, which takes 17-30 minutes per shard. This
seems to happen even when no new indexing is going on.

We were hoping that the sync-up time on node startup would be very short,
but this really slows things down and is very confusing.

Why is a merge operation happening during node startup? I would think a
merge should be scheduled for a low-activity period, which a node startup
clearly is not.

Why does the merge seem to block node recovery during startup?

Can we reschedule the merge to happen after node recovery completes, or
keep it from blocking node recovery? I have spent a lot of time trying to
debug this issue, so any help would be much appreciated. Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

spinscale · September 18, 2013, 9:57pm

Hey,

By default, recovery is throttled in Elasticsearch 0.90. Maybe this is
kicking in in your case (if you are using SSDs, the throttled rate is
really slow for them)? Do you have some monitoring in place to find out
the current read speeds?

See the recovery section at

--Alex
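
To get a feel for how much a recovery throttle could matter, here is a rough
back-of-envelope sketch in Python. Everything numeric in it is an assumption
rather than something stated in this thread: the 20 MB/s cap, the 16 GB shard
size (80 GB split evenly over 5 shards), and the idea that a whole shard copy
is transferred. The setting name indices.recovery.max_bytes_per_sec is also
assumed here, so check the recovery documentation for your version before
relying on it.

```python
import json

MB = 1024 * 1024
GB = 1024 * MB


def recovery_seconds(bytes_to_copy, max_bytes_per_sec):
    """Lower bound on copy time when transfer is capped at max_bytes_per_sec."""
    if max_bytes_per_sec <= 0:
        raise ValueError("throttle must be positive")
    return bytes_to_copy / max_bytes_per_sec


def throttle_settings_body(limit):
    """Build a JSON body for a cluster settings update.

    The setting name is an assumption; verify it against the docs
    for the Elasticsearch version you run.
    """
    return json.dumps(
        {"transient": {"indices.recovery.max_bytes_per_sec": limit}}
    )


# Hypothetical numbers: one 16 GB shard copy at an assumed 20 MB/s cap.
shard_seconds = recovery_seconds(16 * GB, 20 * MB)
print("minutes per shard copy: %.1f" % (shard_seconds / 60))

# Example body raising the cap to a hypothetical 100mb.
body = throttle_settings_body("100mb")
print(body)
```

If the estimate lands in the same range as the per-shard times seen in the
logs, the throttle is worth ruling out before looking at merge settings.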

