Configuration params to address slow node start

Hey there,

I am working with a single-node cluster that currently has about 700 indices, each with 5 shards. I am seeing slow starts of the Elasticsearch node (up to 4m30s). I found the following parameter in the documentation:

cluster.routing.allocation.node_initial_primaries_recoveries

(see https://www.elastic.co/guide/en/elasticsearch/reference/5.x/shards-allocation.html).

I see its default value is 4, allowing 4 concurrent primary recoveries from disk. I bumped this up to a higher number and am getting much better node restart times (the node now starts in about 1m30s with the setting at 128). I am completely winging it on the number here, trying values from 16 up to 256, although I noticed that beyond a certain point I no longer gain any significant improvement.
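
For reference, here is roughly how I am applying it at the moment, via the cluster settings API (just a sketch; the host/port and the value 128 are simply what I am experimenting with, and the same setting can also go in elasticsearch.yml):

    # apply the recovery concurrency setting persistently via the cluster settings API
    # (placeholder host/port; 128 is just my current experimental value)
    curl -XPUT 'http://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
    {
      "persistent": {
        "cluster.routing.allocation.node_initial_primaries_recoveries": 128
      }
    }'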

Is there a recommended maximum for this parameter (maybe the number of cores on the server)? I also wonder why the default is 4.

Also, are there any other configuration parameters you know of that might help speed up the node restart?

Thanks,

Francisco.

Are you on SSDs or spinning disks? Can you take a stack dump (jstack) during the slow recovery and share it here?

Hey Jason,

Thanks for the quick reply.

We are on spinning disks. I have a zip file containing the jstack captures.
Here is what I did:

  1. Set up ES with the parameter at 4, started ES, and captured jstack output every 5 seconds.
  2. Set up ES with the parameter at 128, started ES, and captured jstack output every 5 seconds.

Each directory inside the zip file contains the captured jstack outputs for one parameter setting. Files are named by capture number. I also included the ES startup log so it can be cross-referenced with the jstack captures.
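
In case it helps, the capture loop was essentially the following (shown here as a simplified shell sketch; the PID and file names are placeholders):

    # grab a jstack snapshot of the ES process every 5 seconds during startup
    ES_PID=1234                      # placeholder: the Elasticsearch process id
    for i in $(seq -w 1 60); do
        jstack "$ES_PID" > "jstack_${i}.txt"
        sleep 5
    done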

I tried uploading the zip file to the forum but it did not allow me to do so. I think having these captures every 5s gives you more information than a single isolated capture that might miss the issue. Is there a preferred way for me to share the zip file?

Please let me know if there is anything else I can supply to help understand this scenario.

Thanks,

Francisco.

Hey there,

Any single jstack output is too big for the forum message limit, and posting just one might miss the whole picture. I have placed a zip file with the output of the 5-second captures here:

https://www.dropbox.com/s/hhxbizu2o8n7gaf/jstack-results.zip?dl=0

The big wait happens between this ES log line:

[2017-01-09T10:00:14,285][INFO ][o.e.g.GatewayService ] [WIN2K12R2IMAGE] recovered [665] indices into cluster_state

and this log line:

[2017-01-09T10:04:32,149][INFO ][o.e.c.r.a.AllocationService] [WIN2K12R2IMAGE] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[...]).

Thanks,

Francisco.

Your node is indeed spending a lot of time recovering the shards from disk (reading shard metadata, reading the Lucene commit point, etc.). Other than what you've already done, short of reducing the number of indices or getting faster disks, there is not much you can do here.
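
If it is useful for confirming that, the cat recovery API shows per-shard recovery activity and timings while the node comes up, e.g. something like:

    # show per-shard recovery progress and timings during startup
    curl 'http://localhost:9200/_cat/recovery?v'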

Thanks Jason! So no other config params that might help us here?

We are setting 'cluster.routing.allocation.node_initial_primaries_recoveries' to 128 with good results. Would you have any objections to that number?

Thanks again,

Francisco.

Spinning disks do not like concurrency; my only concern would be thrashing the disk.

Thanks Jason! :slight_smile:

You're very welcome.
