Coordinating nodes in a cluster

We don’t currently use dedicated coordinating (load-balancing) nodes, and I have noticed that when one particular node is under load, it sometimes slows down the entire cluster. This puzzles me, because the indices being searched are not even on the node that is under load. Would adding coordinating nodes deal with this issue?

Context: we are experimenting with adding warm nodes to the cluster, because keeping all of the nodes on performant hardware is becoming a financial burden. We are tuning the configuration of the warm nodes to find a sweet spot, but at times, when the warm nodes don’t have enough resources, the entire cluster suffers, even though the search requests are not targeting indices located on the warm nodes. This leads me to believe that the search requests are being routed to the warm nodes even though the indices are not there. So I am wondering how best to address this, and whether coordinating nodes are what we need in the cluster.

That will only happen if the client sending the request routes it there. Elasticsearch won't send a request from one node to another if it doesn't have the relevant data to query.
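One way to check this is to look at which nodes actually hold shards of the indices being searched. The following is only a rough sketch: it groups rows from the `_cat/shards` API by node. The sample rows and node names are made up for illustration, and `fetch_shard_rows` assumes the `elasticsearch` Python client and a reachable cluster.

```python
from collections import defaultdict


def nodes_holding_shards(rows):
    """Map node name -> sorted list of indices with a started shard copy there.

    `rows` is the parsed JSON from `GET _cat/shards?format=json`.
    """
    by_node = defaultdict(set)
    for row in rows:
        # Unassigned shards have no node; only count copies that are STARTED.
        if row.get("state") == "STARTED" and row.get("node"):
            by_node[row["node"]].add(row["index"])
    return {node: sorted(indices) for node, indices in by_node.items()}


def fetch_shard_rows(es, index_pattern):
    """Fetch shard rows from a live cluster; `es` is an Elasticsearch client."""
    return es.cat.shards(index=index_pattern, format="json")


# Hypothetical sample data, shaped like the _cat/shards JSON output.
sample_rows = [
    {"index": "twitter-2020.06", "shard": "0", "prirep": "p",
     "state": "STARTED", "node": "hot-1"},
    {"index": "twitter-2020.06", "shard": "0", "prirep": "r",
     "state": "STARTED", "node": "hot-2"},
    {"index": "twitter-2019.12", "shard": "0", "prirep": "p",
     "state": "STARTED", "node": "warm-1"},
    {"index": "twitter-2019.12", "shard": "0", "prirep": "r",
     "state": "UNASSIGNED", "node": None},
]

shards_by_node = nodes_holding_shards(sample_rows)
```

If a search only targets indices that live on the hot nodes, the warm nodes should only be involved when the client sends the request to one of them, which then acts as the coordinator.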

What version are you on? Do you have Monitoring running?

We do have monitoring enabled. I am running version 6.6.2. I do fully intend to upgrade to the latest version but so far it is not an easy matter to reindex all the docs from a named doc type to _doc.

Previously, the deprecation log gave me a warning, but it seems that the latest deprecation path no longer gives that. Our indices only have a single doc type, so it doesn’t matter what they are called. Is that why?

Adding coordinating-only nodes should help, but you could also make sure to only expose the hot nodes to the clients.
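For reference, on 6.x a coordinating-only node is one with `node.master`, `node.data` and `node.ingest` all set to `false`. As a hedged sketch, the `node.role` column of `_cat/nodes` can be used to confirm which nodes ended up coordinating-only (shown as `-`). The sample rows below are hypothetical, and the code assumes the JSON keys match the requested column names.

```python
def coordinating_only_nodes(rows):
    """Return the names of nodes whose role column is just '-'.

    `rows` is the parsed JSON from
    `GET _cat/nodes?h=name,node.role&format=json`.
    """
    return [
        row["name"]
        for row in rows
        if row.get("node.role", "").strip() == "-"
    ]


def fetch_node_rows(es):
    """Fetch node rows from a live cluster; `es` is an Elasticsearch client."""
    return es.cat.nodes(h="name,node.role", format="json")


# Hypothetical sample data: m = master-eligible, d = data, i = ingest.
sample_nodes = [
    {"name": "hot-1", "node.role": "mdi"},
    {"name": "warm-1", "node.role": "di"},
    {"name": "coord-1", "node.role": "-"},
]

coordinators = coordinating_only_nodes(sample_nodes)
```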

Thank you. I will definitely try to go that route.

What do you mean by only exposing the hot nodes to the clients? Don’t the hot nodes automatically announce themselves?

For example, let’s say both the hot and warm nodes contain indices matching twitter-*, named in the format twitter-2020.06, with the older indices on the warm nodes and the newer tweets stored on the hot nodes. If, say, I perform a search against twitter-*, does it not automatically search all the nodes in the cluster that hold a matching index? If instead I use a firewall to prevent the client from seeing the warm nodes, would I only be searching the hot nodes, or would it give an error?

All nodes will be searched, but the hot nodes will be coordinating the requests, which puts less pressure on the warm nodes.
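As a minimal sketch of what "only exposing the hot nodes" could look like with the Python client: list only the hot (or coordinating-only) nodes and leave sniffing off, so the client never connects to the warm nodes directly. The host names are hypothetical, and the keyword arguments assume a 6.x `elasticsearch` client to match the 6.6.2 cluster.

```python
def client_settings(entry_hosts):
    """Keyword arguments for a client that talks only to the listed nodes."""
    return {
        "hosts": list(entry_hosts),
        # Leave sniffing off so the client never discovers the warm nodes.
        "sniff_on_start": False,
        "sniff_on_connection_fail": False,
        # Fall back to another listed node if one times out.
        "retry_on_timeout": True,
        "max_retries": 3,
    }


def make_client(entry_hosts):
    """Build the client; needs the `elasticsearch` package installed."""
    # Imported lazily so client_settings() works without the package.
    from elasticsearch import Elasticsearch

    return Elasticsearch(**client_settings(entry_hosts))


# Hypothetical host names for the hot (or coordinating-only) nodes.
ENTRY_HOSTS = ["hot-1.example.internal:9200", "hot-2.example.internal:9200"]

settings = client_settings(ENTRY_HOSTS)
```

A search such as `make_client(ENTRY_HOSTS).search(index="twitter-*", body={"query": {"match_all": {}}})` would still reach shards on the warm nodes, but the node that receives and coordinates the request would always be one of the listed hosts.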

I see. I actually added a coordinating node just now. Just to be clear:

  • When I send requests from the client (we are using the Python client) and it sniffs for nodes during initialization, should I provide just the coordinating node’s hostname? I’ve noticed that it doesn’t seem to matter which node we provide: ES takes that initial node as the starting node and will connect to all of the nodes regardless. The strange thing is that even when a node no longer has any shards (we move them off before we shut down nodes), the client will still complain that it can no longer access it.

If you use sniffing it will detect and use all the nodes.

So when you suggest that I don’t use the warm nodes, are you suggesting that I don’t use sniffing at all? Doesn’t the client sniff by default? I thought sniffing was there so that when a node goes down, the client automatically removes it from the nodes it connects to directly (so perhaps those exceptions are in fact just warning messages?).

You generally provide a list of nodes to the client, and these allow for fallback in case a node goes down. I believe sniffing settings, e.g. in Logstash, make the client query the cluster for additional nodes to connect to, in addition to the ones provided initially.
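If sniffing is still wanted, the 6.x/7.x Python client accepts a `host_info_callback` that can drop nodes from the sniffed list. The sketch below is an illustration only: it assumes the warm nodes are tagged with a node attribute `box_type: warm` (a common convention, not something stated in this thread), and the seed host names passed to `make_sniffing_client` would be your own.

```python
def skip_warm_nodes(node_info, host):
    """Sniffing callback: return None to drop a node, or `host` to keep it."""
    # Mirror the client's default behaviour of skipping dedicated masters.
    if node_info.get("roles", []) == ["master"]:
        return None
    # Assumption: warm nodes are started with node.attr.box_type: warm.
    if node_info.get("attributes", {}).get("box_type") == "warm":
        return None
    return host


def make_sniffing_client(seed_hosts):
    """Build a sniffing client; needs the `elasticsearch` package installed."""
    # Imported lazily so skip_warm_nodes() works without the package.
    from elasticsearch import Elasticsearch

    return Elasticsearch(
        seed_hosts,
        sniff_on_start=True,            # discover nodes when the client starts
        sniff_on_connection_fail=True,  # re-sniff when a node stops responding
        sniffer_timeout=60,             # and refresh the list every 60 seconds
        host_info_callback=skip_warm_nodes,
    )


# Hypothetical node info entries, shaped like the nodes info API output.
warm = {"roles": ["data", "ingest"], "attributes": {"box_type": "warm"}}
hot = {"roles": ["master", "data", "ingest"], "attributes": {"box_type": "hot"}}
candidate = {"host": "10.0.0.1", "port": 9200}
```

Without any of the sniffing options set, the Python client does not sniff and only uses the hosts it was given.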
