DavidTurner, helping once again, suggested that the contents of /var/lib/elasticsearch can be cleared without worry, since any relevant data will be stored on other nodes of the cluster. I did this, re-ran my configuration management agent, and the node was able to rejoin the cluster.
Is there anything else I should look out for while I'm here? Otherwise I'm guessing this post will just go stale.
I'm also wondering if there's any way I can discover the root cause of this file going missing. I did say I cleared out the contents, but actually I ran mv /var/lib/elasticsearch/nodes /var/lib/elasticsearch/nodes.old, so I should be able to analyze any of the files there. I'm just not immediately aware of any tools that would let me discover the root cause. I'm going to check with my backups guy to see if he has any record of the file. Assuming these iterate in alphabetical order, I could probably assume that it hadn't been formed yet, as there were no o's, but several p's, n's, and m's.
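In case it helps anyone following along, here is a minimal sketch of how I might start poking at the renamed directory: list every file under it, oldest first, with its modification time and size, to see what was written around the time things went wrong. The function name list_by_mtime is my own, the path is simply the one from the mv command above, and I'm assuming the rename left the files' modification times untouched. This won't identify what removed the file; it only gives a timeline to compare against logs and backups.

```python
from datetime import datetime, timezone
from pathlib import Path


def list_by_mtime(root):
    """Return (mtime, size, relative path) for each regular file under root, oldest first."""
    root = Path(root)
    if not root.is_dir():
        # Nothing to report if the directory is missing.
        return []
    entries = []
    for path in root.rglob("*"):
        if path.is_file():
            st = path.stat()
            entries.append((st.st_mtime, st.st_size, str(path.relative_to(root))))
    # Tuples sort by their first element, so this orders by modification time.
    return sorted(entries)


if __name__ == "__main__":
    # Assumed location: the directory I renamed above. Adjust if yours differs.
    for mtime, size, rel in list_by_mtime("/var/lib/elasticsearch/nodes.old"):
        stamp = datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()
        print(f"{stamp}  {size:>12}  {rel}")
```

This only reads the nodes.old copy, which Elasticsearch is no longer using, so it shouldn't interfere with the live data path.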
Anything that could help me discover the root cause would be really helpful, as I should be able to build some monitoring or CM to properly care for the directory contents. Thanks!
The two likely explanations are (a) something other than Elasticsearch removed this file or (b) you had a power outage while Elasticsearch was writing the node state and your storage system performed some operations in the wrong order just before the outage. In either case you should be worried.
The solution is not to try to monitor the contents of the data path: it's best to consider it as being entirely under Elasticsearch's control. But it's definitely worth getting to the bottom of this if you can.
Yes, I'm wondering whether the operation I ran, having the CM execute a POST to create a new user, somehow interrupted cluster operations, but that just seems so unlikely. I'm glad everything seems to be working, though. I can breathe a bit easier going into the weekend.