Stats endpoint for checking a single node health

joanvila · May 24, 2017, 9:52am

I have an Elasticsearch cluster with 6 nodes behind a load balancer that sends requests to those nodes. The load balancer needs to query the nodes periodically to check their health (whether they are alive and ready to serve requests). For that purpose, I was using the _nodes/stats endpoint with a timeout of 2 seconds.

With this configuration, when one node went down, the load balancer marked all the nodes as unhealthy instead of only the failing one. This made me think that the _nodes/stats endpoint internally queries all the nodes in the cluster and, since one of them was not responding, every health check was failing.

Looking at the docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html), I saw the _nodes/_local/stats endpoint, whose requests should be resolved on the node that receives them, without asking the other nodes for anything.

My question is: is my assumption correct that the _nodes/stats endpoint queries all the nodes internally? If so, changing the endpoint to the _local one should solve the issue.

Also, is this the proper way to get the health of a single Elasticsearch node? I only want to know whether this node is able to reply to requests.

warkolm · May 24, 2017, 9:58am

Yes, it does. Try _nodes/_local as you mention (see Cluster APIs, Elasticsearch Reference [5.4]).

Otherwise, just query IP:9200 directly: if a node replies then it's OK, if not then something is up.
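As a rough illustration of the "query IP:9200 and see if it replies" idea, here is a minimal Python sketch of such a probe. The function name `node_is_healthy` and the example address are hypothetical, and the 2-second default simply mirrors the timeout mentioned in the original question; a real load balancer would normally do this check itself.

```python
import http.client
import urllib.request


def node_is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the node answers GET on base_url with HTTP 200 in time."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers connection refused, timeouts, DNS failures and HTTP error
        # statuses (urllib's URLError and HTTPError are OSError subclasses).
        return False
```

It could be called as `node_is_healthy("http://192.0.2.10:9200/")`, where the address is a placeholder for one node. This only shows that the node's HTTP layer answers; it says nothing about the state of the rest of the cluster.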

pfreixes · June 2, 2017, 2:17pm

Yep, we changed the strategy to the URL proposed by @warkolm, _nodes/_local/stats/http, and we had the same issue: one node went down for an unknown reason and none of the other nodes could reply to the load balancer for a short period of time, between 10 and 20 seconds.

However, other regular requests like the search one continued working like a charm.

Could the _nodes/_local/stats/http URL involve some internal communication with all of the nodes in the cluster?

What do you recommend: checking at another network level, such as TCP? Is that trustworthy enough to infer that a node is up and healthy?
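For reference, a TCP-level check of the kind asked about here might look like the following Python sketch. The name `tcp_port_open` is hypothetical, and port 9200 is assumed only because it is the HTTP port mentioned earlier in the thread. Note that a successful connect only shows that something is accepting connections on that port, not that the node can actually serve requests.

```python
import socket


def tcp_port_open(host: str, port: int = 9200, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable host, DNS failure or timeout.
        return False
```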

system · June 30, 2017, 2:18pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
