Super Helpful Feature Request regarding Elasticsearch node data

I know that, as a sysadmin, it should be my responsibility to keep a list somewhere of all the servers / server names that belong to a specific cluster. Unfortunately, when you are in a rush to meet a deadline, it is easy to forget to do something like this, and here's where the problem starts:

When a new cluster is created and new nodes are added to the cluster, we have several options to query the node list to see which servers are currently online within the cluster. However, if nodes drop out of the cluster for whatever reason and if the sysadmin neglected to keep a node list updated, there doesn't appear to be any functionality within Elasticsearch to show what nodes are missing.

Now, I understand that one of the major features of Elasticsearch is the ability to scale out horizontally and to add / remove nodes as needed, but it would be SUPER helpful if Elasticsearch kept a log file somewhere that simply showed which nodes have been added / removed from a cluster.

This would allow someone to very quickly determine which servers are missing from the cluster. Seeing "UNASSIGNED" is a shot in the gut because we would know that some nodes are missing, but we might not have recorded the names of those nodes as the cluster grows.

What I am proposing is a very simple log file that would show the names of the nodes that joined a cluster at some point in time, so that if some nodes disappeared, we would at least be able to determine what the names of those nodes were.

This would be extremely helpful when you have several dozen nodes and you know there are supposed to be 42 nodes but only 39 are showing under /_cat/nodes.

If we had some simple log file that just recorded when new nodes were added (and removed), we could very quickly scan the file to locate the missing node names.

Thank you!!

There is already something like this.

When a node joins or leaves the cluster, the master node records this information in its log.
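As a rough sketch of how you could pull that information out of the master log: the script below scans log lines for `node-join` and `node-left` events and reports the nodes whose most recent event was a departure. The exact log line layout differs between Elasticsearch versions, so the regular expression and the sample lines (node names, timestamps, IDs) are assumptions made up for illustration, not output copied from a real cluster.

```python
import re

# Assumption: the event name is followed by "[{" and the node name is the
# content of that first pair of braces. Adjust for your version's log format.
EVENT_RE = re.compile(r"(node-join|node-left)\[\{([^}]+)\}")


def membership_events(log_lines):
    """Return (event, node_name) tuples in the order they appear."""
    events = []
    for line in log_lines:
        for match in EVENT_RE.finditer(line):
            events.append((match.group(1), match.group(2)))
    return events


def nodes_that_left(log_lines):
    """Return the names of nodes whose most recent event is node-left."""
    last_event = {}
    for event, name in membership_events(log_lines):
        last_event[name] = event
    return sorted(name for name, event in last_event.items() if event == "node-left")


# Made-up sample lines, only to show the shape of the input.
SAMPLE_LOG = [
    "[2024-01-01T10:00:00,000][INFO ][o.e.c.s.MasterService] [master-1] node-join[{data-1}{aaa} join existing leader]",
    "[2024-01-01T10:01:00,000][INFO ][o.e.c.s.MasterService] [master-1] node-join[{data-2}{bbb} join existing leader]",
    "[2024-01-01T11:00:00,000][INFO ][o.e.c.s.MasterService] [master-1] node-left[{data-2}{bbb} reason: disconnected]",
    "[2024-01-01T11:05:00,000][INFO ][o.e.c.s.MasterService] [master-1] node-join[{data-3}{ccc} join existing leader]",
    "[2024-01-01T11:30:00,000][INFO ][o.e.c.s.MasterService] [master-1] node-left[{data-3}{ccc} reason: disconnected]",
    "[2024-01-01T11:45:00,000][INFO ][o.e.c.s.MasterService] [master-1] node-join[{data-3}{ccc} join existing leader]",
]

if __name__ == "__main__":
    print(nodes_that_left(SAMPLE_LOG))
```

To run it against a real log, pass an open file object instead of `SAMPLE_LOG`, since the functions only need an iterable of lines. Keep in mind that only the node that was master at the time logs these events, so after a master change you may need to check more than one node's log.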

It is also recommended to monitor the cluster with Metricbeat, or at least with some other monitoring tool.

Also, a log file containing just the nodes that joined and left the cluster wouldn't solve the problem. What if the log file gets too big and is deleted by a logrotate policy? You would end up in the same situation.

The way to solve a problem like this is to monitor the cluster and to make setting up that monitoring part of provisioning new nodes.
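If you don't have a monitoring tool in place yet, a minimal stopgap could look like the sketch below: keep the expected node names in an inventory that your provisioning updates, and compare it with what `/_cat/nodes` currently returns. The base URL, the node names, and the function names are assumptions for illustration, and the sketch assumes a cluster reachable without authentication or TLS.

```python
import json
import urllib.request


def current_node_names(base_url="http://localhost:9200"):
    """Fetch the names of the nodes currently in the cluster.

    Uses GET /_cat/nodes with h=name and format=json, which returns a JSON
    list of objects such as {"name": "es-01"}. The base URL is an assumption.
    """
    url = f"{base_url}/_cat/nodes?h=name&format=json"
    with urllib.request.urlopen(url, timeout=10) as response:
        rows = json.load(response)
    return {row["name"] for row in rows}


def missing_nodes(expected, current):
    """Return the expected node names that are not currently in the cluster."""
    return sorted(set(expected) - set(current))


if __name__ == "__main__":
    # Hypothetical inventory; in practice this would be written by whatever
    # provisions your nodes.
    expected = ["es-01", "es-02", "es-03"]
    print(missing_nodes(expected, current_node_names()))
```

Run on a schedule, a check like this tells you which node names are missing, not just how many, and it does not depend on log files that may already have been rotated away.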

Thank you leandrojmp! I will check the master node logs. That's very helpful info!