Hi everybody,
I've been running a 2-node Elasticsearch cluster for the last couple of months, with around 1TB of indexed data in total.
Today, during heavy load, both nodes became unresponsive. I brought Node 1 down with a kill (perhaps not a good thing to do, in hindsight).
Node 2 is not responding to kill: the server no longer listens on port 9200, but the process is still hanging around even though no disk I/O appears to be taking place.
I can bring Node 1 back up, so that port 9200 responds again with the usual message. However, when I check the cluster health, I get the following error:
http://192.168.1.69:9200/_cluster/health
=>
{"error":"MasterNotDiscoveredException[waited for [30s]]","status":503}
and a test search gives me
http://192.168.1.69:9200/_search?pretty=1
=>
{
"error" : "ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];[SERVICE_UNAVAILABLE/2/no master];]",
"status" : 503
}
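In case it helps anyone reproduce this, here is a minimal sketch of how the two checks above could be scripted with the Python standard library. The function names (summarize, fetch, check_nodes) are hypothetical helpers of my own, not part of any Elasticsearch client, and the 40-second timeout is an assumption chosen only to outlast the 30s master-discovery wait shown in the error.

```python
import json
import urllib.error
import urllib.request


def summarize(body):
    """Turn an Elasticsearch JSON response body into a one-line summary."""
    doc = json.loads(body)
    if "error" in doc:
        # Error replies look like {"error": "...", "status": 503}.
        return "HTTP {}: {}".format(doc.get("status"), doc["error"])
    # A normal /_cluster/health reply carries a green/yellow/red "status".
    return "cluster status: {}".format(doc.get("status"))


def fetch(url, timeout=40.0):
    """Return the response body, including the body of 4xx/5xx replies."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises on a 503, but the JSON error body is still readable.
        return err.read().decode("utf-8")


def check_nodes(nodes):
    """Print one health line per node base URL (hypothetical helper)."""
    for node in nodes:
        try:
            print(node, "->", summarize(fetch(node + "/_cluster/health")))
        except (OSError, ValueError) as err:
            # Connection refused, timeout, or a non-JSON reply.
            print(node, "-> unreachable or unparsable:", err)
```

Calling check_nodes(["http://192.168.1.69:9200"]) would print one line for Node 1; Node 2's address would have to be added to the list.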
Is there anything I can do to recover my data, or have I messed this up for good? At the moment, neither node responds to a shutdown request via the Shutdown API either.
Thanks for your help!