Rolling Restart an Elasticsearch Cluster with Ansible

I've come up with what I think is a safe way to perform a rolling
restart of an Elasticsearch cluster using Ansible handlers.

Why is this needed?

Even if you use a serial setting to limit the number of nodes processed
at one time, Ansible will restart Elasticsearch nodes and continue
processing as soon as the elasticsearch service restart reports itself
complete. This can push the cluster into a red state, because multiple
data nodes end up restarting at once, and can cause performance problems.

Solution:

Perform a rolling restart of each Elasticsearch node and wait for the
cluster to stabilize before moving on to the next node. This set of
chained handlers restarts each node while keeping the cluster from
thrashing on reallocating shards during the process.
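
For readers who want a feel for the shape of such a chain, here is a
rough sketch. It is not the contents of the gist. It assumes
Elasticsearch answers HTTP on localhost:9200 without authentication,
that the service is named "elasticsearch", and that the play runs with
serial: 1 so handlers are flushed one node at a time. The handler names,
timeouts, and retry counts are made up for illustration.

```yaml
# handlers/main.yml: illustrative sketch only, not the author's gist.
# Assumed: localhost:9200, no auth, service name "elasticsearch".
# Newer Elasticsearch releases may prefer "persistent" over "transient".

- name: restart elasticsearch
  ansible.builtin.uri:
    url: http://localhost:9200/_cluster/settings
    method: PUT
    body_format: json
    body:
      transient:
        cluster.routing.allocation.enable: "none"
  changed_when: true   # force "changed" so the next handler is notified
  notify: restart elasticsearch service

- name: restart elasticsearch service
  ansible.builtin.service:
    name: elasticsearch
    state: restarted
  notify: wait for elasticsearch port

- name: wait for elasticsearch port
  ansible.builtin.wait_for:
    host: localhost
    port: 9200
    timeout: 300
  changed_when: true
  notify: re-enable shard allocation

- name: re-enable shard allocation
  ansible.builtin.uri:
    url: http://localhost:9200/_cluster/settings
    method: PUT
    body_format: json
    body:
      transient:
        cluster.routing.allocation.enable: "all"
  register: reenable
  # An open port does not mean the node has rejoined the cluster yet.
  until: reenable.status == 200
  retries: 30
  delay: 10
  changed_when: true
  notify: wait for green cluster

- name: wait for green cluster
  ansible.builtin.uri:
    url: http://localhost:9200/_cluster/health
  register: health
  # Check the HTTP status first so a failed request does not touch .json.
  until: health.status == 200 and health.json.status == 'green'
  retries: 60
  delay: 10
```

A task in the role would then use notify: restart elasticsearch to kick
off the whole chain. Each handler is defined after the one that notifies
it, since handlers run in the order they appear in the file.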

A gist of the handlers/main.yml file for my Elasticsearch role is at
Ansible rolling restart of Elasticsearch Cluster · GitHub
I welcome any comments and/or suggestions.

--[Lance]

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ed35d8af7b9fb038b5b821b855ad75ba%40webmail.bearcircle.net.
For more options, visit https://groups.google.com/d/optout.


Hello Lance,

This looks really nice and useful, thanks for sharing!

When you send a cluster health command, you can also tell it to
wait_for_status=green or wait_for_status=yellow with a specified timeout.

This seems a nicer approach than repeating the request and verifying the
outcome from Ansible. But then again, if you have just restarted a node,
you don't know when the node will respond to requests at all. So waiting
for green might work on the last step, but maybe not when you restart
each node (where you would wait for yellow, since you disabled shard
allocation), unless the init script ensures ES is responsive when it
returns successfully.
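
To make the idea concrete, a single Ansible task along these lines could
ask the cluster health API to block until the status is reached. This is
only a sketch: the host, port, and both timeout values are assumptions,
and the task name is invented.

```yaml
# Illustrative task; localhost:9200 and the timeout values are assumptions.
- name: wait for the cluster to reach at least yellow
  ansible.builtin.uri:
    # Server-side wait: Elasticsearch holds the request for up to 60s.
    url: "http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=60s"
    # Client-side limit, kept longer than the server-side wait.
    timeout: 90
  register: health
  # Fail on a bad HTTP status, or when Elasticsearch reports the wait timed out.
  failed_when: health.status != 200 or (health.json.timed_out | default(false))
```

Because a freshly restarted node may refuse connections for a while, a
task like this would probably still need a port check before it, or
retries around it.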

Best regards,
Radu

On Wednesday, September 17, 2014 10:43:10 PM UTC+3, Lance A. Brown wrote:

I've come up with what I think is a safe way to rolling restart an
Elasticsearch cluster using Ansible handlers.

Why is this needed?

Even if you use a serial setting to limit the number of nodes processed
at one time, Ansible will restart elasticsearch nodes and continue
processing as soon as the elasticsearch service restart reports itself
complete. This pushes the cluster into a red state due to multiple data
nodes being restarted at once, and can cause performance problems.

Solution:

Perform a rolling restart of each elasticsearch node and wait for the
cluster to stabilize before continuing processing nodes. This set of
chained handlers restarts each node while keeping the cluster from
thrashing on reallocating shards during the process.

A gist of the handlers/main.yml file for my Elasticsearch role is at
Ansible rolling restart of Elasticsearch Cluster · GitHub

I welcome any comments and/or suggestions.

--[Lance]
