I recently started some cluster testing with ES:
- 3 nodes for ES (2.2)
- all nodes are both master-eligible and data nodes
- client: logstash, with ES output plugin
- Continuous, significant load via logstash
- shut down the master node (Ctrl+C)
-> a new master node is elected
- start the previously shut-down node again
- while the status of the ES cluster is still yellow:
shut down the newly elected master node
With this test scenario, I can easily create data loss.
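To make the "still yellow" step reproducible, the status check between the steps could be scripted. This is only a minimal sketch: it assumes a node that is still running is reachable at http://localhost:9200 (my assumption, adjust as needed) and reads the standard `_cluster/health` endpoint. The `safe_to_stop_a_node` flag is my own rule of thumb (wait for green), not something ES reports.

```python
import json
from urllib.request import urlopen

# Assumption: address of any node that is still up during the test.
ES_URL = "http://localhost:9200"


def fetch_cluster_health(base_url=ES_URL, timeout=5):
    """Return the parsed JSON body of GET /_cluster/health."""
    with urlopen(base_url + "/_cluster/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_health(health):
    """Reduce a _cluster/health response to the fields relevant for this test."""
    return {
        "status": health["status"],
        "nodes": health["number_of_nodes"],
        "unassigned_shards": health["unassigned_shards"],
        "initializing_shards": health["initializing_shards"],
        # Rule of thumb (my assumption): only stop another node once all
        # primaries and replicas are allocated again, i.e. status is green.
        "safe_to_stop_a_node": health["status"] == "green",
    }
```

Calling `summarize_health(fetch_cluster_health())` right before the second shutdown would record how many shard copies were still unassigned or initializing at that moment.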
I've read through quite a few documents. The result:
In status "yellow", data loss is considered "acceptable" when problems occur.
Thus, this situation is treated as "OK" from the ES point of view.
However, from a user's point of view, it is not:
An administrator doesn't know whether data loss occurred, so important error log information could have been lost without anyone noticing.
Basically, I would expect to be able to determine whether data loss occurred.
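One way such a check could look, as a rough sketch only: if every event carried a monotonically increasing sequence number (a hypothetical field, nothing logstash or ES adds by default as far as I know), the numbers found in the index after the test could be scanned for gaps. The function name `find_gaps` is made up for illustration.

```python
def find_gaps(sequence_numbers):
    """Return inclusive (start, end) ranges missing between the smallest
    and largest sequence number that was actually found."""
    seen = sorted(set(sequence_numbers))
    gaps = []
    # Compare each number with its successor; a jump larger than 1 is a gap.
    for prev, cur in zip(seen, seen[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps
```

For example, `find_gaps([1, 2, 3, 7, 8, 10])` reports the ranges 4 to 6 and 9 as missing. Events lost before the first or after the last indexed number would not be detected this way, so the sender would also need to record the last number it sent.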
Any suggestions on how we could deal with this situation?
I think increasing the number of nodes/replicas will improve the situation.
But still, it could occur.
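For reference, raising the replica count could be done through the index settings API. The sketch below only builds the request; the base URL, the index pattern `logstash-*` and the value 2 (one copy per remaining node in a 3-node cluster) are assumptions for my setup, and `build_replica_update` is a made-up helper name.

```python
import json
from urllib.request import Request


def build_replica_update(base_url, index_pattern, replicas):
    """Build a PUT <index>/_settings request that changes number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        url="{}/{}/_settings".format(base_url, index_pattern),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

The request would be sent with `urllib.request.urlopen(build_replica_update("http://localhost:9200", "logstash-*", 2))`. More replicas only reduce the chance that every copy of a shard is unavailable at once; they do not tell me afterwards whether something was lost.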
Are there any best practices for this?
Many thanks in advance for your help.