Coordinating nodes in a cluster

smlbiobot · June 13, 2020, 11:26pm

We don’t currently use load-balancing / coordinating nodes, and I have seen that when a particular node is under load, it sometimes slows down the entire cluster. This is very mysterious to me, as the indices being searched are not even on the node that is under load. Would adding coordinating nodes deal with this issue?

Context: we are experimenting with adding warm nodes to the cluster, as keeping all of the nodes on performant hardware is starting to become a financial burden. We are tweaking the configuration of the warm nodes to find a good sweet spot, but at times we can see that if the warm nodes don’t have enough resources, the entire cluster suffers, even though the search requests are not searching indices located on the warm nodes. This leads me to believe that the search requests are being routed to the warm nodes even though the indices are not there. So I am wondering how best to address this, and whether coordinating nodes are what we need in the cluster.

warkolm · June 15, 2020, 1:30am

That will only happen if the client sending the request routes it there. Elasticsearch won't forward a request from one node to another unless that other node holds data relevant to the query.

What version are you on? Do you have Monitoring running?

smlbiobot · June 21, 2020, 6:10am

We do have monitoring enabled. I am running version 6.6.2. I do fully intend to upgrade to the latest version but so far it is not an easy matter to reindex all the docs from a named doc type to _doc.

Previously, the deprecation log gave me a warning, but it seems that the latest deprecation path no longer does. Our indices only have a single doc type, so it doesn’t matter what it is called. Is that why?

Christian_Dahlqvist · June 21, 2020, 7:47am

Adding coordinating-only nodes should help, but you could also make sure to only expose the hot nodes to the clients.
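
For reference, on 6.x a coordinating-only node is one with the master, data, and ingest roles all disabled (`node.master: false`, `node.data: false`, `node.ingest: false` in `elasticsearch.yml`). Such a node reports an empty `roles` list in the nodes info API. Below is a minimal sketch for checking which nodes are coordinating-only, given the parsed JSON of a `GET _nodes` response. The sample response and the node names in it are made up for illustration.

```python
def coordinating_only_nodes(nodes_info):
    """Return the names of nodes that report no roles (coordinating-only)."""
    return sorted(
        node["name"]
        for node in nodes_info.get("nodes", {}).values()
        if not node.get("roles")
    )


# Hypothetical, heavily trimmed GET _nodes response (node names are invented).
sample = {
    "nodes": {
        "a1": {"name": "hot-1", "roles": ["master", "data", "ingest"]},
        "b2": {"name": "warm-1", "roles": ["data"]},
        "c3": {"name": "coord-1", "roles": []},
    }
}

print(coordinating_only_nodes(sample))
```

With the Python client, the same function could presumably be fed the result of `es.nodes.info()`.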

smlbiobot · June 21, 2020, 8:07am

Thank you. I will definitely try to go that route.

What do you mean by only exposing the hot nodes to the clients? Don’t the hot nodes automatically announce themselves?

For example, let’s say both the hot and warm nodes contain indices matching twitter-*, named in the format twitter-2020.06, with the older indices on the warm nodes and the newer tweets stored on the hot nodes. If, say, I perform a search against twitter-*, does it not automatically search all the nodes in the cluster that hold a matching index? If I instead use a firewall to prevent the client from seeing the warm nodes, would I only be searching the hot nodes, or would it give an error?

Christian_Dahlqvist · June 21, 2020, 9:19am

All nodes will be searched, but the hot nodes will be coordinating the requests, which puts less pressure on the warm nodes.
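
As a rough sketch of what "only exposing the hot nodes" could look like with the Python client (6.x/7.x): give the client an explicit list of hot (or coordinating-only) hosts and leave sniffing off, so it never learns about the warm nodes. The hostnames below are placeholders, and `client_kwargs` is a hypothetical helper, not part of the client library. The three sniffing options shown are already the client's defaults; they are spelled out only to make the intent explicit.

```python
def client_kwargs(hot_hosts):
    """Build connection settings that pin the client to the given hosts only."""
    if not hot_hosts:
        raise ValueError("at least one host is required")
    return {
        "hosts": list(hot_hosts),
        # Sniffing disabled: the client only ever talks to the hosts above.
        "sniff_on_start": False,
        "sniff_on_connection_fail": False,
        "sniffer_timeout": None,
    }


# Placeholder hostnames (assumption, not from this thread).
HOT_HOSTS = ["hot-1.example.internal:9200", "hot-2.example.internal:9200"]
kwargs = client_kwargs(HOT_HOSTS)

# With elasticsearch-py installed, the settings would be used like this:
# from elasticsearch import Elasticsearch
# es = Elasticsearch(**kwargs)
# es.search(index="twitter-*", body={"query": {"match_all": {}}})
```

A search against twitter-* sent this way still reaches the shards on the warm nodes; the difference is that a hot node does the coordinating work (fan-out and merging of results).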

smlbiobot · June 21, 2020, 10:06am

I see. I actually added a coordinating node right now. Just to be clear:

When I send requests from the client (we are using the Python client) and it sniffs for nodes at initialization, should I just provide the coordinating node’s hostname? I ask because I’ve noticed that it doesn’t seem to matter which node we provide to the client. It takes that initial node as the starting point and connects to all of the nodes regardless. The strange thing is that even when a node no longer has any shards (we move the shards off before we shut down nodes), the client still complains that it can no longer access it.

Christian_Dahlqvist · June 21, 2020, 10:12am

If you use sniffing it will detect and use all the nodes.

smlbiobot · June 21, 2020, 11:37am

So when you suggest that I don’t use the warm nodes, are you suggesting that I don’t use sniffing at all? Doesn’t the client sniff by default? I thought sniffing is there so that when a node is down, the client automatically removes it from the list of nodes it connects to directly (so perhaps those exceptions are in fact just warning messages?).

Christian_Dahlqvist · June 21, 2020, 12:19pm

You generally provide a list of nodes to the client, and these allow for fallback in case a node goes down. I believe sniffing settings, e.g. in Logstash, make the client query the cluster for additional nodes to connect to, in addition to the ones provided initially.
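
If sniffing is still wanted, the 6.x/7.x Python client accepts a `host_info_callback(node_info, host)` that is called for every sniffed node; returning `None` skips that node. A minimal sketch, assuming the warm nodes are tagged with a custom node attribute. The attribute name `box_type` and the value `warm` are assumptions here, so substitute whatever attribute the cluster actually uses, and check that the sniffed node info includes the attributes on your version.

```python
def skip_warm_nodes(node_info, host):
    """Sniffing callback: drop nodes tagged as warm, keep everything else."""
    attrs = node_info.get("attributes", {})
    # "box_type" is an assumed custom node attribute, not a built-in setting.
    if attrs.get("box_type") == "warm":
        return None  # the client will not add this node to its pool
    return host


# With elasticsearch-py 6.x/7.x installed, it would be wired in like this:
# from elasticsearch import Elasticsearch
# es = Elasticsearch(
#     ["hot-1.example.internal:9200"],  # placeholder seed host
#     sniff_on_start=True,
#     sniff_on_connection_fail=True,
#     host_info_callback=skip_warm_nodes,
# )
```

Newer client versions changed the sniffing API, so this callback name may not apply there.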

system · July 19, 2020, 12:19pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
