Indexing or search requests are sent to a down node

I have an Elasticsearch (v5.6.10) cluster with 3 nodes.

  • Node A : Master
  • Node B : Master + Data
  • Node C : Master + Data

There are 6 shards per data node with replication set to 1. All 6 primary shards are on Node B and all 6 replicas are on Node C.

When I shut down one node for maintenance, I can see indexing or search requests still trying to reach that node and failing. I suspect this is because the client connecting to Elasticsearch is configured with all three node IPs.

Is there any way to avoid the requests reaching that down node?

I guess it depends on the client.

You should definitely upgrade everything: the cluster, the client... It's way too old.

Can you please elaborate on that?

I am trying to find a solution where the client does not face errors in this situation.

I saw that cluster status being yellow does not cause any issue during the maintenance period. But if a request goes to the down node, that’s when the problem occurs.

And yes, they are very old. I am planning the upgrade, but it will take some time. For now I need to keep things running. :slight_smile:

Which client are you using?

It’s the Java client “elasticsearch-rest-high-level-client”, and all the Elasticsearch node IPs are provided as a list when creating the REST client.

Did you add the sniffer? See the "Sniffer" page in the Elasticsearch Java API Client [8.8] documentation.

The client contains a connection pool, which should mark connections as down once a failure has been detected. You could therefore see a few requests target the downed node before this is detected. This assumes you are using the client correctly, as a singleton, and not creating it for each request.
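To make the behaviour described above concrete, here is a minimal, self-contained sketch of a pool that marks a node as dead after a failed request and skips it until a retry timeout has passed. It is only a simplified model and not the real client's code: `NodePool`, `send`, the node names and the fixed 60-second timeout are all hypothetical, and the actual client's retry policy is more elaborate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DeadNodePoolSketch {

    /** Hypothetical stand-in for a client-side connection pool (not the real client class). */
    static class NodePool {
        private final List<String> nodes;
        private final Map<String, Long> deadUntil = new HashMap<>();
        private final long retryAfterMillis;
        private int next = 0;

        NodePool(List<String> nodes, long retryAfterMillis) {
            this.nodes = new ArrayList<>(nodes);
            this.retryAfterMillis = retryAfterMillis;
        }

        /** Round-robin over nodes that are not currently marked dead. */
        String pick(long now) {
            for (int i = 0; i < nodes.size(); i++) {
                String candidate = nodes.get(next);
                next = (next + 1) % nodes.size();
                Long until = deadUntil.get(candidate);
                if (until == null || until <= now) {
                    return candidate;
                }
            }
            throw new IllegalStateException("no live nodes");
        }

        void markDead(String node, long now) {
            deadUntil.put(node, now + retryAfterMillis);
        }

        void markAlive(String node) {
            deadUntil.remove(node);
        }
    }

    /**
     * Sends one simulated request. If the picked node is unreachable it is marked dead,
     * recorded in failedAttempts, and the request is retried on the next node.
     */
    static String send(NodePool pool, Set<String> downNodes, long now, List<String> failedAttempts) {
        while (true) {
            String node = pool.pick(now);
            if (downNodes.contains(node)) {
                pool.markDead(node, now);
                failedAttempts.add(node);
                continue;
            }
            pool.markAlive(node);
            return node;
        }
    }

    public static void main(String[] args) {
        // One pool shared by all requests, mirroring the "singleton" advice.
        NodePool pool = new NodePool(Arrays.asList("node-a", "node-b", "node-c"), 60_000L);
        Set<String> down = new HashSet<>(Arrays.asList("node-b"));
        List<String> failed = new ArrayList<>();

        for (int i = 0; i < 4; i++) {
            System.out.println("served by " + send(pool, down, 0L, failed));
        }
        // Only the first request routed to node-b fails; later ones skip it.
        System.out.println("failed attempts: " + failed);
    }
}
```

If a new pool were created for every request, the dead-node bookkeeping would be lost each time, which is why the singleton point matters.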

Yes, I did. I was also looking into that. It seems I missed something.

Just to confirm: the sniffer will remove any down node from the active node list and also add it back once it is up?

Yes, it is implemented as a singleton, but the connection to the down node is not automatically withdrawn from the connection pool.

I never used it but I guess it works that way according to the doc :wink:

It is also possible to enable sniffing on failure, meaning that after each failure the nodes list gets updated straightaway rather than at the following ordinary sniffing round. In this case a SniffOnFailureListener needs to be created at first and provided at RestClient creation. Also once the Sniffer is later created, it needs to be associated with that same SniffOnFailureListener instance, which will be notified at each failure and use the Sniffer to perform the additional sniffing round as described.
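The steps in that passage could be wired up roughly as follows. This is a configuration sketch only, assuming the 5.6-era client where `RestHighLevelClient` wraps an existing low-level `RestClient`, and assuming the `elasticsearch-rest-client-sniffer` artifact is on the classpath. The host names, port and the interval values are placeholders, not recommendations.

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.sniff.SniffOnFailureListener;
import org.elasticsearch.client.sniff.Sniffer;

public class SniffingClientSketch {
    public static void main(String[] args) throws Exception {
        // 1. Create the listener first, so it can be handed to the RestClient builder.
        SniffOnFailureListener sniffOnFailureListener = new SniffOnFailureListener();

        // 2. Build the low-level client with the listener attached.
        //    Host names below are placeholders for the three cluster nodes.
        RestClient lowLevelClient = RestClient.builder(
                        new HttpHost("node-a.example", 9200, "http"),
                        new HttpHost("node-b.example", 9200, "http"),
                        new HttpHost("node-c.example", 9200, "http"))
                .setFailureListener(sniffOnFailureListener)
                .build();

        // 3. Build the sniffer on the same client; the delays are illustrative values.
        Sniffer sniffer = Sniffer.builder(lowLevelClient)
                .setSniffIntervalMillis(60_000)
                .setSniffAfterFailureDelayMillis(30_000)
                .build();

        // 4. Associate the sniffer with the same listener instance.
        sniffOnFailureListener.setSniffer(sniffer);

        // Assumption: 5.6-style constructor taking the low-level client.
        RestHighLevelClient client = new RestHighLevelClient(lowLevelClient);

        // ... use `client` as the application-wide singleton ...

        // Close the sniffer before the client it uses.
        sniffer.close();
        lowLevelClient.close();
    }
}
```

In later client versions the high-level client is constructed differently, so the last part would need adjusting after an upgrade.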

I hope so too. I will give it a try and update here. Thanks a lot for all the guidance.
