Anyone using systemd to ensure Elasticsearch is restarted if it crashes?

Hello all,

Not sure how good an idea this is, but I'm thinking of adding a few lines to the systemd unit file for Elasticsearch so the service automatically restarts if it dies.

Something like this, using the Restart option:

...
[Service]
...
Restart=always
RestartSec=3
...
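
For reference, a minimal sketch of how this could look as a drop-in override instead of an edit to the packaged unit file. It assumes the unit is named `elasticsearch.service` (the Puppet module may name it differently) and a systemd version recent enough to accept the `StartLimit*` settings in `[Unit]`. The `on-failure` policy and the limit values are illustrative choices, not tested recommendations:

```ini
# Hypothetical drop-in path; the unit name is an assumption:
# /etc/systemd/system/elasticsearch.service.d/override.conf

[Unit]
# Give up after 3 start attempts within 10 minutes, so a node that keeps
# crashing ends up in the "failed" state instead of restarting forever.
StartLimitIntervalSec=600
StartLimitBurst=3

[Service]
# Restart only after an unclean exit (non-zero exit code, fatal signal,
# timeout), not after a clean stop.
Restart=on-failure
# Wait 3 seconds between the crash and the restart attempt.
RestartSec=3
```

After adding or changing a drop-in, `systemctl daemon-reload` is needed for systemd to pick it up. Bounding the number of restarts like this limits the restart loop but does not replace alerting on the crash itself.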

Is anyone doing this? Looks like you can control the unit file template with $init_template in the Puppet module :slight_smile:

Anyway, any reason not to do this?

Cheers,
AB

If a node crashes, it is because something catastrophic has happened, and it is likely to crash again if it restarts. So it's better to be notified of the incident and analyze the cause, so the node can come back online safely (instead of going into an endless restart loop that will do more harm to the cluster than good).

Hi Thiago,

thanks for the quick reply :slight_smile:

I thought about that. I should probably spend more time on finding the root cause of my problems. Some sort of OOM issue brought down my 20-node cluster the other day. After the nodes were started again, the cluster has been pretty stable for a few days. So a restart "fixed" the problem in the short term. But the long-term solution is to do the legwork and find the underlying issue...

Will hold off on restarting Elasticsearch automatically then.

Cheers,
AB

In this case, restarting could solve the issue. But what if it makes things worse? You cannot account for every possible failure, so there is always a chance that restarting will make it worse. Because of this, it's simply better to keep monitoring the cluster and take proper action in case of failure.

Thank you, Thiago, for the advice.
