Data vanished twice for an unknown reason (likely due to disk usage)

  • We have an 8-node ES cluster (2.3.3) with 52 GB RAM, 8 cores and a 1 TB Azure Premium Disk on Ubuntu 16.04.

  • Everything was working fine, but all of a sudden we observed watermark warnings in the logs and all indices were dropped from the cluster.

  • It happened once yesterday, and then, after we restored the data from a snapshot, it happened again a couple of hours ago.

  • We are in the dark, as we don't fully know what caused it. Any help, pointers or debugging tips would be appreciated. It could be an Azure issue or perhaps an ES bug. We are sure that none of our clients is issuing a system-wide index drop command.

ES stores anything that is sent to it, so if the volume of incoming data increases, it will simply use whatever disk space is available. This is not a bug.

Are you not measuring disk use? What about cluster access?
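
If you are not tracking disk use yet, something along these lines could be a starting point. It is only a rough sketch: it reads the per-node filesystem totals from the nodes stats API (`/_nodes/stats/fs`) and compares them against the default disk watermarks (85% low, 90% high). The base URL `http://localhost:9200`, the function names and the assumption that you have not changed the watermark settings are mine, so adjust them for your cluster.

```python
import json
from urllib.request import urlopen

# Assumed defaults for cluster.routing.allocation.disk.watermark.low / .high.
# Change these if the cluster settings override them.
LOW_WATERMARK_PCT = 85.0
HIGH_WATERMARK_PCT = 90.0


def disk_usage_report(node_stats, low=LOW_WATERMARK_PCT, high=HIGH_WATERMARK_PCT):
    """Summarise a parsed /_nodes/stats/fs response.

    Returns a list of (node_name, used_pct, status) tuples, fullest node first.
    """
    report = []
    for node in node_stats.get("nodes", {}).values():
        totals = node.get("fs", {}).get("total", {})
        total_bytes = totals.get("total_in_bytes", 0)
        available_bytes = totals.get("available_in_bytes", 0)
        if total_bytes <= 0:
            # No usable filesystem figures for this node; skip it.
            continue
        used_pct = 100.0 * (total_bytes - available_bytes) / total_bytes
        if used_pct >= high:
            status = "above high watermark"
        elif used_pct >= low:
            status = "above low watermark"
        else:
            status = "ok"
        report.append((node.get("name", "unknown"), round(used_pct, 1), status))
    return sorted(report, key=lambda row: row[1], reverse=True)


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch filesystem stats for all nodes (base_url is a placeholder)."""
    with urlopen(base_url + "/_nodes/stats/fs", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    for name, used_pct, status in disk_usage_report(fetch_node_stats()):
        print("%-20s %5.1f%% used  %s" % (name, used_pct, status))
```

Run it from cron or feed the numbers into whatever monitoring you already have, so that the next time the watermark warnings show up you can see which nodes filled up and when.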