Simulation of Master Node Failure - Data Loss


(Frazzle) #1

Hi,

I have been implementing an ELK setup for our logging platform, and as part of this I decided to do some DR testing. I terminated our master node and it rebuilt automatically. It rejoined the cluster as the master node fine, but it did not recover any indices.

I have read a fair bit online and not really found any clear answer here.

Is it possible to have 1 master node, to lose this and to recover all data? If so, how can it be implemented?

The configs are pretty basic, just the data/master settings and the ec2 discovery settings.

Thanks!


(Mark Walkom) #2

This should be handled with https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway-local.html#_dangling_indices

What version are you on?


(Frazzle) #3

Hi Mark,

I've been able to re-simulate a master failure and I had a similar outcome. I had the dangling indices setting set to yes on the nodes, and when the master rebuilt, I got:

curl 'localhost:9200/_cat/indices?v'
health status index pri rep docs.count docs.deleted store.size pri.store.size

What could I have missed?

Running on the most recent version (1.5.2?).

Thanks!


(Mark Walkom) #4

Can you elaborate on that, what do you mean by rebuilt exactly?


(Frazzle) #5

Hi,

We terminated the master EC2 instance and another was built with the same .yml configuration file in place. It took over as the master node, but it did not recognise the indices on the node.

Thanks.


(Mark Walkom) #6

Well, unless you attached the old EBS volume or copied the ES data over to the new EBS volume, the data on that old node is gone.
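
If it helps with future DR tests, here is a minimal Python sketch for checking whether the indices you expect survived a rebuild. It only parses the text returned by `_cat/indices?v` (the same output pasted earlier in this thread) and reports which expected index names are absent. The function names and the index name in the usage note are hypothetical, and the parsing assumes every index is open, so that each column in a row is populated.

```python
def parse_cat_indices(text):
    """Parse the text output of `_cat/indices?v` into a list of dicts.

    Assumes whitespace-separated columns with a header row, and that
    every index is open so no column in a row is empty.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Pair each row's values with the column names from the header.
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def missing_indices(text, expected):
    """Return the expected index names that are absent from the output."""
    present = {row["index"] for row in parse_cat_indices(text)}
    return sorted(set(expected) - present)
```

For the header-only output shown earlier, `missing_indices(output, ["logstash-2015.05.01"])` would return the full expected list, which is the signal that the rebuilt node came up with an empty data directory.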

