High recovery time during rolling restart of Elasticsearch 6.2

rdevgamer · December 4, 2018, 3:01pm

Hi Team.

We have 6 Elasticsearch servers. All of them are data nodes, and 3 of them are marked as master-eligible. We currently have 5TB of data stored across these nodes, and all 6 nodes are load balanced using HAProxy.

On our platform these hosts require an OS restart (for reasons that are not clear to me). Using the links below as a reference, we drew up a list of steps for performing a rolling restart.

https://www.elastic.co/guide/en/elasticsearch/reference/6.2/rolling-upgrades.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.2/restart-upgrade.html

1. Disable shard allocation and perform a synced flush
2. Shut down a single node
3. Restart the OS and remove old logs from /var/log/elasticsearch
4. Start the node
5. Re-enable shard allocation
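
For readers following along, the API calls around one node restart can be sketched roughly as below. This is an illustration, not what was actually run in this thread: `BASE_URL`, `allocation_body`, `restart_plan` and `send` are hypothetical names, the address is an assumption, and the re-enable step assumes `"all"` as the value that restores the default allocation behaviour.

```python
import json
import urllib.request

# Assumption: a node's own HTTP address; the thread used a load balancer URL instead.
BASE_URL = "http://localhost:9200"


def allocation_body(mode):
    """Build the body for PUT /_cluster/settings; mode is "none" or "all"."""
    return {"persistent": {"cluster.routing.allocation.enable": mode}}


def restart_plan():
    """Return the ordered API calls around one node restart as (method, path, body)."""
    return [
        ("PUT", "/_cluster/settings", allocation_body("none")),  # step 1: disable allocation
        ("POST", "/_flush/synced", None),                        # step 1: synced flush
        # steps 2-4 (stop node, reboot OS, start node) happen outside the API
        ("GET", "/_cat/nodes?v", None),                          # confirm the node rejoined
        ("PUT", "/_cluster/settings", allocation_body("all")),   # step 5: re-enable allocation
        ("GET", "/_cat/health?v", None),                         # watch yellow -> green
    ]


def send(method, path, body=None, base_url=BASE_URL):
    """Send one call and return the raw response text (raises on non-2xx responses)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Keeping the plan as plain data makes it easy to review the order of calls before anything is sent; each tuple can then be passed to `send(*step)` one at a time, with the manual node restart done between the synced flush and the `_cat/nodes` check.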

It took more than 12 hours to recover, that is, for the cluster state to change from yellow to green. During those 12+ hours, none of the new data sent by the various components showed up.
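
One way to see how far a yellow cluster is from green is to look at the shard counters in the `GET /_cluster/health` response. The sketch below only summarises an already parsed response; `recovery_summary` is a hypothetical helper and the sample values in the usage note are made up for illustration, not taken from this cluster.

```python
def recovery_summary(health):
    """Summarise a parsed GET /_cluster/health response."""
    # Shards that are not yet active: still being recovered or not assigned at all.
    pending = health["initializing_shards"] + health["unassigned_shards"]
    return {
        "status": health["status"],
        "pending_shards": pending,
        "relocating_shards": health["relocating_shards"],
        "active_percent": health["active_shards_percent_as_number"],
        "recovered": health["status"] == "green" and pending == 0,
    }
```

For example, a response with `"status": "yellow"`, 2 initializing and 40 unassigned shards would be reported as 42 pending shards and `recovered: False`. Polling this periodically shows whether the pending count is actually shrinking or is stuck.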

We didn't touch the other nodes or Logstash; they were up and running the whole time. We invoked all the API calls through the load balancer URL, not a node's local URL.

Did we do anything wrong, and if so, what should we change in our approach? This monitoring cluster is used by 25+ projects for log analysis, so a long outage is not acceptable.

Kindly help.

thanks

s1monw · December 7, 2018, 3:36pm

Basic question: did you remove the data from the node when you restarted it?

simon

system · January 4, 2019, 3:36pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
