ES instance restart causing shard initialization

stecino · March 8, 2016, 1:47am

Hello,

I am running ES 2.1.2 and Logstash 2.2. My cluster has 1 query node, 3 master nodes, and 10 data nodes, all dedicated, with 16GB allocated to ES. My indices use 5 shards with 1 replica, and the data per index is about 450GB including the replica. Today I added an NFS repository mount to my servers and had to update the path.repo parameter, which requires an ES restart. Every ES restart causes the shards to re-initialize. Any pointers on which setting in my config may cause this issue?

Here is the config snippet from my data nodes:

gateway.recover_after_nodes: 1
gateway.recover_after_time: 5m
gateway.expected_nodes: 1

cluster.routing.allocation.node_initial_primaries_recoveries: 4
cluster.routing.allocation.node_concurrent_recoveries: 2
indices.recovery.max_bytes_per_sec: 20mb
indices.recovery.concurrent_streams: 5

warkolm · March 8, 2016, 2:57am

Set delayed allocation as well: https://www.elastic.co/guide/en/elasticsearch/reference/2.2/delayed-allocation.html
You will want to do a synced flush before restarting each node too, then you should be good.
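As a rough sketch of the two steps suggested above, the following Python builds the corresponding REST requests for ES 2.x: the index setting `index.unassigned.node_left.delayed_timeout` applied to all indices, and a synced flush. The host and port, the `5m` default, and the helper names (`delayed_allocation_request`, `synced_flush_request`, `send`) are assumptions for illustration, not something given in this thread.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a node reachable on the default HTTP port


def delayed_allocation_request(timeout="5m", base_url=ES_URL):
    """Build a PUT that sets the delayed allocation timeout on all indices."""
    body = json.dumps(
        {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url + "/_all/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def synced_flush_request(index=None, base_url=ES_URL):
    """Build a POST to the synced flush endpoint (all indices when index is None)."""
    path = "/_flush/synced" if index is None else "/" + index + "/_flush/synced"
    return urllib.request.Request(base_url + path, data=b"", method="POST")


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

One possible use is to call `send(delayed_allocation_request("5m"))` once, then `send(synced_flush_request())` just before stopping each node. The timeout should be longer than the time a node normally takes to restart.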

stecino · March 8, 2016, 7:11pm

So I set up delayed allocation and ran the synced flush. Indices that are not being written to are handled as expected once the node comes back. But the shards that belong to the index currently being written to do go through initialization. So when you need a full cluster restart to enable a setting, in my case path.repo, how do you handle it?

warkolm · March 8, 2016, 9:49pm

What you are doing is the best method, unless you can stop indexing temporarily (because then the flush will help).

stecino · March 8, 2016, 11:31pm

Maybe introducing a caching layer in front of Logstash will help: I would process everything, turn off Logstash, and let events collect at the caching layer until I turn things on again.

So once the shards initialized, I am down to one:

index shard time type stage source_host target_host repository snapshot files files_percent bytes bytes_percent total_files total_bytes translog translog_percent total_translog
lwes-2016.03.08 4 2787147 replica translog ********* *********** n/a n/a 226 100.0% 25631075448 100.0% 349 99950918291 1789204 66.1% 2708072

The shard itself is at 100% but the translog isn't. Can I ignore this at this point? Because of this the cluster is still yellow, but all the shard icons are showing green.
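To make the recovery row above easier to read, here is a small Python sketch that pairs each column name with its value and recomputes the translog progress from the `translog` and `total_translog` columns. The helper names are made up for illustration; the header and row are copied from the output above. The row shows the replica in the `translog` stage, meaning the files have been copied and translog operations are still being replayed, which would be consistent with the cluster staying yellow until that shard finishes.

```python
HEADER = (
    "index shard time type stage source_host target_host repository snapshot "
    "files files_percent bytes bytes_percent total_files total_bytes "
    "translog translog_percent total_translog"
)
ROW = (
    "lwes-2016.03.08 4 2787147 replica translog ********* *********** n/a n/a "
    "226 100.0% 25631075448 100.0% 349 99950918291 1789204 66.1% 2708072"
)


def parse_cat_row(header, row):
    """Pair whitespace-separated column names with the values of one row."""
    columns = header.split()
    values = row.split()
    if len(columns) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(columns, values))


def translog_progress(record):
    """Percentage of translog operations already replayed on the target."""
    done = int(record["translog"])
    total = int(record["total_translog"])
    return 100.0 * done / total if total else 100.0


record = parse_cat_row(HEADER, ROW)
print(record["type"], record["stage"], round(translog_progress(record), 1))
# prints: replica translog 66.1
```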

stecino · March 10, 2016, 10:30pm

I wanted to find out: should I run this synced flush every day, so that I am covered if a node drops out of the cluster for whatever reason or if I plan to bring a node down? Also, I have 3 different kinds of indices on the cluster. The synced flush will apply to each index separately if I issue it for all of them, right?

warkolm · March 11, 2016, 10:41pm

That's what we recommend!

Curator can do that, but there is no need to flush an index once writes have finished to it.

A flush is per index, unless you use *.
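Since the flush is applied per index, the synced flush response can be checked per index as well. The sketch below assumes the ES 2.x response shape, with a top-level `_shards` summary plus one entry per index carrying `total`, `successful`, and `failed` counts. The helper name, the second index name, and all counts in the sample are invented for illustration; an index that is still receiving writes is the likely candidate for failed shard copies.

```python
def indices_with_failed_flush(response):
    """Return {index_name: failed_count} for indices where the synced flush failed."""
    failed = {}
    for name, stats in response.items():
        if name == "_shards":  # cluster-wide summary, not an index entry
            continue
        if stats.get("failed", 0) > 0:
            failed[name] = stats["failed"]
    return failed


# Illustrative response only: counts and the 03.07 index name are made up.
sample_response = {
    "_shards": {"total": 20, "successful": 18, "failed": 2},
    "lwes-2016.03.07": {"total": 10, "successful": 10, "failed": 0},
    "lwes-2016.03.08": {"total": 10, "successful": 8, "failed": 2},
}

print(indices_with_failed_flush(sample_response))
# prints: {'lwes-2016.03.08': 2}
```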
