I have an Elasticsearch cluster with 6 nodes behind a load balancer that distributes requests among them. The load balancer needs to poll each node periodically to check its health (whether it is alive and ready to serve requests). For that purpose, I was using the _nodes/stats endpoint with a timeout of 2 seconds.
With this configuration, when one node went down, the load balancer marked all the nodes as unhealthy instead of only the one that was failing. This made me think that the _nodes/stats endpoint internally queries every node in the cluster and that, since one of them was not responding, every health check was failing.
Looking at the docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html), I saw the _nodes/_local/stats endpoint, whose requests should be resolved on the node that receives them, without asking the other nodes for anything.
My question is: am I correct in assuming that the _nodes/stats endpoint queries all the nodes internally? If so, changing the endpoint to the _local one should solve the issue.
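For reference, here is a minimal sketch of the check I have in mind, written in Python with only the standard library. The function name node_is_alive and the base URL passed to it are hypothetical, and treating "HTTP 200 with a non-empty nodes object" as healthy is my own assumption about the response shape, not something I have confirmed against the docs.

```python
import json
import urllib.error
import urllib.request


def node_is_alive(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the node at base_url answers a local-only stats request.

    Hypothetical helper: the name and the success criteria are assumptions.
    """
    # _local is meant to restrict the request to the node that receives it.
    url = f"{base_url.rstrip('/')}/_nodes/_local/stats"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return False
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or invalid JSON.
        return False
    # Assumption: a healthy reply carries a non-empty "nodes" object.
    return isinstance(body, dict) and bool(body.get("nodes"))
```

The load balancer would call this once per node with the same 2-second timeout, for example node_is_alive("http://10.0.0.1:9200"), where the address is only a placeholder.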
Also, is this the proper way to check the health of a single Elasticsearch node? I only want to know whether this node is able to respond to requests.