I have an Elasticsearch cluster with two nodes (a two-master setup). It is used to store logs.
Suddenly one of the masters went down and refuses to come back up. Its logs show the following error:
Suppressed: java.io.IOException: No space left on device
I tried to list the index names (and possibly delete some indexes) by talking to the node that is still alive, i.e. the second master. But it refuses to respond and returns the following error:
{
  "error": {
    "root_cause": [
      {
        "type": "master_not_discovered_exception",
        "reason": null
      }
    ],
    "type": "master_not_discovered_exception",
    "reason": null
  },
  "status": 503
}
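For reference, the kind of request that fails this way can be sketched as follows. This is only an illustration: the address http://localhost:9200 and the helper names list_indices and describe_error are assumptions, not something taken from my actual setup. It calls the _cat/indices endpoint and, on an HTTP error, prints the status and error type from the JSON body.

```python
import json
import urllib.error
import urllib.request

# Assumption: the surviving node listens on the default HTTP port on localhost.
ES_URL = "http://localhost:9200"


def list_indices(base_url: str = ES_URL, timeout: float = 10.0) -> str:
    """Return the plain-text output of GET /_cat/indices?v."""
    req = urllib.request.Request(f"{base_url}/_cat/indices?v")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def describe_error(body: str) -> str:
    """Summarize an Elasticsearch JSON error body as 'status: type'."""
    doc = json.loads(body)
    return f"{doc['status']}: {doc['error']['type']}"


if __name__ == "__main__":
    try:
        print(list_indices())
    except urllib.error.HTTPError as exc:
        # A 503 response (such as the one above) is raised as HTTPError;
        # its body still carries the JSON error document.
        print(describe_error(exc.read().decode("utf-8")))
    except OSError as exc:
        # Connection refused, timeouts, DNS failures and similar.
        print(f"node unreachable: {exc}")
```

With the error body shown above, describe_error returns "503: master_not_discovered_exception".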
I'm perfectly comfortable with dropping some older data, i.e. some older indexes. I have access to the disk and could try to wipe some older data from the /var/lib/elasticsearch folder (if I'm not mistaken), but I'm not sure how exactly Elasticsearch organizes its data, so it's very likely that I would corrupt it.
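Before touching anything, the disk situation can at least be inspected read-only. A minimal sketch, assuming the data really lives under /var/lib/elasticsearch (the path may differ depending on path.data); the names DATA_PATH, dir_size and report are made up for illustration. It prints the free space on the filesystem and the size of each top-level directory, and deletes nothing.

```python
import os
import shutil

# Assumption: default data directory; adjust to the configured path.data.
DATA_PATH = "/var/lib/elasticsearch"


def dir_size(path: str) -> int:
    """Total size in bytes of the files under path (symlinks are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def report(path: str = DATA_PATH) -> None:
    """Print filesystem usage and per-directory sizes; read-only."""
    mib = 1024 * 1024
    usage = shutil.disk_usage(path)
    print(f"total {usage.total // mib} MiB, free {usage.free // mib} MiB")
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            print(f"{dir_size(entry.path) // mib:>10} MiB  {entry.name}")


if __name__ == "__main__":
    if os.path.isdir(DATA_PATH):
        report()
    else:
        print(f"{DATA_PATH} not found; set DATA_PATH to the real data directory")
```

This only shows where the space went; it does not tell which files are safe to remove.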
What are the next steps to mitigate this?
Is there an offline CLI tool that would be able to remove some older indexes? If such a tool existed, I could use it to wipe older indexes and restart the Elasticsearch cluster.
As a last resort, I can of course stop Elasticsearch, increase the disk size, and start Elasticsearch again.