Data loss after servers hosting the Primary shard and Replica shard were rebooted at the same time


(Lee Chuen Ooi) #1

Hi,
I am using Elasticsearch version 1.5.0, running on Windows.

The Elasticsearch cluster has 8 shards with 1 replica each, hosted on 6 data
nodes plus 3 master nodes.

2 of the data nodes were rebooted at the same time.
One of them was hosting 6P (primary of the 6th shard).
The other one was hosting 6R (replica of the 6th shard).
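
To make the scenario concrete, here is a minimal sketch that checks which shards have every copy on the nodes that went down. The node names and the shard layout are hypothetical (only the placement of 6P and 6R follows the description above), and the function `shards_fully_offline` is an illustrative helper, not part of Elasticsearch. It only models which shards become unavailable; it does not explain why data would vanish from disk.

```python
from collections import defaultdict


def shards_fully_offline(allocation, down_nodes):
    """Return the shard numbers whose every copy sits on a node in down_nodes."""
    hosts = defaultdict(set)
    for node, copies in allocation.items():
        for shard, _role in copies:
            hosts[shard].add(node)
    down = set(down_nodes)
    # A shard is fully offline when the set of nodes holding it is a subset
    # of the set of nodes that went down.
    return sorted(shard for shard, nodes in hosts.items() if nodes <= down)


# Hypothetical layout: 8 shards (0-7), 1 replica each, spread over 6 data nodes.
# Only the placement of shard 6 (primary on node1, replica on node2) is taken
# from the post; everything else is made up for illustration.
ALLOCATION = {
    "node1": [(0, "P"), (3, "R"), (6, "P")],
    "node2": [(1, "P"), (4, "R"), (6, "R")],
    "node3": [(2, "P"), (5, "R"), (0, "R")],
    "node4": [(3, "P"), (7, "R"), (1, "R")],
    "node5": [(4, "P"), (5, "P"), (2, "R")],
    "node6": [(7, "P")],
}

print(shards_fully_offline(ALLOCATION, ["node1", "node2"]))
```

With this layout, rebooting node1 and node2 together takes both copies of shard 6 offline, while every other shard keeps at least one copy on a running node.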

After these 2 data nodes rejoined the cluster, 6P and 6R are both in
"UNASSIGNED" status.
I checked the data folder on every data node, and none of them has the index
folder for the 6th shard.
//0/indices//6 does not exist.
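
One way to confirm the shard state is to list the shards with the `_cat/shards` API and filter for unassigned rows. The sketch below parses the plain-text output of that API; the index name `myindex`, the node names, and the sample rows are hypothetical, and the column order (index, shard, prirep, state, ...) is assumed to be the default one.

```python
def unassigned_shards(cat_shards_text):
    """Return (index, shard, prirep) for each UNASSIGNED row of _cat/shards output."""
    result = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Default columns: index shard prirep state docs store ip node.
        # Unassigned rows carry only the first four columns.
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            result.append((fields[0], int(fields[1]), fields[2]))
    return result


# Hypothetical sample of what `GET /_cat/shards` might return in this situation.
SAMPLE = """\
myindex 5 p STARTED 1200 1.1mb 10.0.0.5 node5
myindex 5 r STARTED 1200 1.1mb 10.0.0.3 node3
myindex 6 p UNASSIGNED
myindex 6 r UNASSIGNED
"""

print(unassigned_shards(SAMPLE))
```

If both the `p` and `r` rows of the same shard show up here, no started copy of that shard exists anywhere in the cluster.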

Questions

  1. Is this a bug? Why would the shard-6 folder disappear completely?
    Shouldn't the shard-6 folder still exist?
  2. Is there any way to fix it? Since the index folder is completely gone, I
    assume there is no way to recover the data.
  3. This would be a disaster in production if the servers hosting the primary
    and the replica of the same shard went down together. How can we
    prevent this kind of data loss?

Thanks.

Lee Chuen



(system) #2