ES instance restart causing shard initialization


(Stecino) #1

Hello,

I am running ES 2.1.2 and Logstash 2.2. My cluster has 1 query node, 3 master nodes, and 10 data nodes, all dedicated, with 16GB allocated to ES. Each index has 5 shards with 1 replica, and data per index is about 450GB including the replica. Today I added an NFS mount for the repository to my servers and had to update the path.repo parameter, which requires an ES restart. Every ES restart causes the shards to re-initialize. Any pointers on which setting in my config may cause this?

Here is the config snippet from my data nodes:

gateway.recover_after_nodes: 1
gateway.recover_after_time: 5m
gateway.expected_nodes: 1

cluster.routing.allocation.node_initial_primaries_recoveries: 4
cluster.routing.allocation.node_concurrent_recoveries: 2
indices.recovery.max_bytes_per_sec: 20mb
indices.recovery.concurrent_streams: 5


(Mark Walkom) #2

Set delayed allocation as well: https://www.elastic.co/guide/en/elasticsearch/reference/2.2/delayed-allocation.html
You will want to do a synced flush before restarting each node too, then you should be good.
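
For reference, here is a minimal sketch of those two calls in Python, using only the standard library. The endpoint paths and the `index.unassigned.node_left.delayed_timeout` setting are the ones documented for ES 2.x. The `ES_URL` address, the `5m` timeout, and the function names are assumptions made for illustration, not anything taken from this cluster.

```python
import json
import urllib.request

# Assumption: a node's HTTP endpoint is reachable here; adjust for your cluster.
ES_URL = "http://localhost:9200"


def delayed_allocation_request(timeout="5m", base_url=ES_URL):
    """Build a PUT that delays replica reallocation after a node leaves."""
    body = {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}
    return urllib.request.Request(
        url=base_url + "/_all/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def synced_flush_request(base_url=ES_URL):
    """Build a POST that requests a synced flush on all indices."""
    return urllib.request.Request(
        url=base_url + "/_flush/synced", data=b"", method="POST"
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Typical use would be `send(delayed_allocation_request())` once, then `send(synced_flush_request())` right before each node restart. Building the request separately from sending it makes it easy to inspect what will be sent before touching a live cluster.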


(Stecino) #3

So I set up delayed allocation and ran the synced flush. Indices that are not being written to are handled as expected once the node comes back. But the shards of the index currently being written to do go through initialization. So when you need a full cluster restart to enable a setting, in my case path.repo, how do you handle it?


(Mark Walkom) #4

What you are doing is the best method, unless you can stop indexing temporarily (because then the flush will help).


(Stecino) #5

Maybe introducing a caching layer in front of Logstash will help: I would let everything in flight finish processing, turn off Logstash, and let events collect at the caching layer until I turn things on again.

So once the shards initialized, I am down to one:

index shard time type stage source_host target_host repository snapshot files files_percent bytes bytes_percent total_files total_bytes translog translog_percent total_translog
lwes-2016.03.08 4 2787147 replica translog ********* *********** n/a n/a 226 100.0% 25631075448 100.0% 349 99950918291 1789204 66.1% 2708072

The shard itself is at 100% but the translog isn't. Can I ignore this at this point? Because of this the cluster is still yellow, but all the shard icons are showing green.
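
One way to read that row: the stage column says `translog`, which means the file copy is finished and the replica is still replaying translog operations, so it is not yet started and the cluster stays yellow until that completes. The sketch below pairs the header with the row above and recomputes the translog progress. The helper names `parse_recovery_row` and `translog_progress` are made up for illustration.

```python
# Header and row copied from the recovery output above (hosts are masked there).
HEADER = (
    "index shard time type stage source_host target_host repository snapshot "
    "files files_percent bytes bytes_percent total_files total_bytes "
    "translog translog_percent total_translog"
).split()

ROW = (
    "lwes-2016.03.08 4 2787147 replica translog ********* *********** n/a n/a "
    "226 100.0% 25631075448 100.0% 349 99950918291 1789204 66.1% 2708072"
).split()


def parse_recovery_row(header, row):
    """Pair each column name with its value; fail loudly on a mismatch."""
    if len(header) != len(row):
        raise ValueError("header has %d columns, row has %d" % (len(header), len(row)))
    return dict(zip(header, row))


def translog_progress(record):
    """Return replayed translog operations as a percentage of the total."""
    done = int(record["translog"])
    total = int(record["total_translog"])
    return 100.0 * done / total if total else 100.0


record = parse_recovery_row(HEADER, ROW)
print(record["stage"], round(translog_progress(record), 1))  # prints: translog 66.1
```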


(Stecino) #6

Wanted to find out: should I run this synced flush every day so that I am covered if a node drops out of the cluster for whatever reason, or if I plan to bring a node down? Also, I have 3 different kinds of indices on the cluster. If I issue a synced flush for all of them, it will apply to each index separately, right?


(Mark Walkom) #7

That's what we recommend! :slightly_smiling:

Curator can do that, but there is no need to flush an index once writes have finished to it.

A flush is per index, unless you use *.
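
To make the targeting concrete, here is a small sketch of how the synced flush path changes with the index list. It only builds the REST path and does not contact a cluster. The function name `synced_flush_path` and the `lwes-*` pattern are illustrative assumptions based on the index name seen earlier in this thread.

```python
def synced_flush_path(indices=None):
    """Return the REST path for a synced flush.

    indices: None (or empty) for every index, otherwise a list of index
    names or wildcard patterns, which are joined with commas.
    """
    if not indices:
        return "/_flush/synced"
    return "/" + ",".join(indices) + "/_flush/synced"


print(synced_flush_path())                     # /_flush/synced
print(synced_flush_path(["lwes-2016.03.08"]))  # /lwes-2016.03.08/_flush/synced
print(synced_flush_path(["lwes-*"]))           # /lwes-*/_flush/synced
```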

