ES 6.2.4 How to find out which nodes are offline

Hi Team,

What is the recommended way to find out which nodes in the cluster are offline using the REST API?

In Kibana monitoring I can see which nodes are offline but I am not sure how Kibana actually determines that - does it do a comparison between a previously known list of nodes?

Cheers,

Hi,

You can find the number of active nodes by querying GET /_nodes/stats in the Kibana console, or from the following REST endpoint: http://ip:9200/_nodes/stats

Thanks. My impression was that GET _nodes/stats only shows nodes that are up, not those that are down? Also, from the JSON returned, do I have to look at the process.timestamp key and figure out when the node was last up? How do I determine from the JSON that a node is down?

That's true.
I believe that Monitoring records that a node has been seen; if the node then stops reporting statistics to the monitoring cluster for a while, it's considered down.

I did not check the code though.

I ran into the same problem a few years ago: there is nothing in GET _nodes/stats telling you if a node has dropped out of the cluster. Kibana monitoring will only show the node (in gray) for the duration of the selected Time Range (1 hour by default). After that it will be gone from Kibana monitoring too, until it rejoins the cluster.

My solution was to create a simple cluster deployment configuration, which I found useful for other things too (such as full cluster restarts), in which I list all the nodes that should be in a particular cluster. A monitoring script then uses this configuration when checking _nodes/stats and reports if a node is missing. This requires a bit of scripting but is a straightforward job for a programmer.
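
A minimal sketch of such a check, in Python. It assumes the expected node names are kept in a plain list and that the _nodes/stats response has a top-level "nodes" object keyed by node ID, each entry carrying a "name" field. The function names, node names and sample response are made up for illustration; this is not the actual script described above.

```python
import json
from urllib.request import urlopen


def fetch_node_stats(base_url):
    """Fetch GET /_nodes/stats from a cluster (base_url is e.g. http://ip:9200)."""
    with urlopen(base_url.rstrip("/") + "/_nodes/stats", timeout=10) as resp:
        return json.load(resp)


def missing_nodes(expected_names, stats):
    """Return the expected node names that do not appear in a _nodes/stats body."""
    # Only nodes currently in the cluster are listed, so anything absent is treated as down.
    present = {node.get("name") for node in stats.get("nodes", {}).values()}
    return sorted(set(expected_names) - present)


# Hypothetical deployment list and a trimmed-down sample response.
expected = ["es-node-1", "es-node-2", "es-node-3"]
sample_stats = {
    "cluster_name": "example",
    "nodes": {
        "aaaa": {"name": "es-node-1"},
        "bbbb": {"name": "es-node-2"},
    },
}

print(missing_nodes(expected, sample_stats))  # ['es-node-3']
```

In a real check, `sample_stats` would be replaced by `fetch_node_stats("http://ip:9200")`, and a connection error would need handling separately, since it means the queried node itself is unreachable.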

Thanks David and Bernt!

We have the same approach: a config file that lists all the nodes that are supposed to be running (with their settings/attributes). I was planning to compare it against GET _cat/nodes (which also only shows members that are up).
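
For the _cat/nodes variant, the comparison could look roughly like the sketch below. It assumes the request is made as GET _cat/nodes?h=name, so that the plain-text response contains one node name per line. The helper names and node names are hypothetical, and the sketch also flags nodes that are up but not in the config, which goes slightly beyond the check described here.

```python
def parse_cat_nodes(text):
    """Parse the plain-text body of GET _cat/nodes?h=name into a set of node names."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def diff_cluster(expected_names, cat_text):
    """Compare the configured node names with the nodes currently in the cluster."""
    live = parse_cat_nodes(cat_text)
    expected = set(expected_names)
    return {
        "missing": sorted(expected - live),     # configured but not in the cluster
        "unexpected": sorted(live - expected),  # in the cluster but not configured
    }


# Hypothetical config entries and a sample _cat/nodes?h=name response body.
configured = ["es-node-1", "es-node-2", "es-node-3"]
cat_body = "es-node-1\nes-node-3\nes-node-9\n"

print(diff_cluster(configured, cat_body))
```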

Just wanted to check that I'm not going the more complicated way before I implement the above - thanks guys for giving me the confidence :smile: .
