Load balancer (like F5) vs Coordinating node

Which is better, and why?

Edit

Small 3-node cluster; all nodes are master and data nodes. Each node has 8 cores, 32 GB RAM, and an 8 GB heap.
Purpose of cluster: search
Request rate: 150/s

Would adding a coordinating node improve latency and throughput?

Hello, this is an interesting question, or it could be if you gave more info on your needs and setup :wink:

IMHO the main advantage of the coordinating node is that it's aware of the topology so it can route requests directly to the right shards.

It also does the aggregation/collection (call it what you wish) of the data gathered from all shards.

It's not a naive proxy.
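To make the "collection" part concrete, here is a minimal sketch of the scatter/gather idea, not Elasticsearch's actual implementation. The shard names, scores, and the `query_shard` and `coordinate_search` helpers are hypothetical; the only point is that each shard returns its own top hits and the coordinating node merges them into a global top N.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (score, doc_id); hypothetical shape for illustration


def query_shard(shard_hits: List[Hit], size: int) -> List[Hit]:
    """Stand-in for the per-shard query: return that shard's top `size` hits."""
    return heapq.nlargest(size, shard_hits)


def coordinate_search(shards: Dict[str, List[Hit]], size: int) -> List[Hit]:
    """Scatter the query to every shard, then merge the partial results."""
    partial_results = [query_shard(hits, size) for hits in shards.values()]
    # Gather step: pick the global top `size` hits from all partial results.
    return heapq.nlargest(size, (hit for hits in partial_results for hit in hits))


# Made-up data: three shards holding scored documents.
shards = {
    "shard-0": [(0.9, "a"), (0.2, "b")],
    "shard-1": [(0.7, "c"), (0.8, "d")],
    "shard-2": [(0.1, "e")],
}
top = coordinate_search(shards, size=3)
print(top)
```

A plain load balancer does none of this merging; it only forwards the request to a node, and that node then does the work above.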

Added more detail. In general, I am looking for generic guidelines.

Are there any figures on what sort of improvement can be observed with a coordinating node compared to a load balancer?

No. I can only tell you that with coordinating nodes you will save one network hop.

Doesn't the coordinating node take over the responsibility of collecting the responses, sorting them, and sending them back as JSON? If so, does this have any impact on performance?

A coordinating node is the node which gets a request from the outside. It can be dedicated or any node in the cluster. So what you describe will happen anyway somewhere in the cluster.

Dedicating a node then allows giving more resources to it.

I also have a similar cluster and am using the transport client. If I do not have any dedicated coordinating nodes, does one of the nodes act as the coordinating node?
All 3 of my nodes are master/data nodes.

Yes.

A coordinating node is the node which gets a request from the outside

That is a very interesting discussion!

I have a few more questions on the same topic:

  1. Let's say I have a web application which is only aware of one data node out of the 3 data nodes available in the cluster. That data node, which becomes the coordinating node, surely communicates with all the other nodes. But does it mean that all my requests go through its search thread pool? That could explain why my searches sometimes hang, and then I suddenly get all the pending search responses at once, including the latest one.

  2. Concerning indexing. Let's say again that my indexing application is only aware of 1 data node. I believe this data node knows how to distribute the data among all the other nodes and shards. But, once again, will its indexing queue become a bottleneck?

  3. By the way, when using the NEST client, is there a way to send the search/indexing operations in round robin across all the nodes? Update: here is an excellent explanation of what I was looking for. Apparently, I wasn't making round-robin requests, since NEST was given only 1 node address and no sniffing was used. Will making search requests in round robin improve performance?
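For illustration only, this is roughly what client-side round robin means. It is a sketch in Python, not NEST's API: the `RoundRobinNodes` class and the node URLs are hypothetical. Each request goes to the next node in the list, so the coordinating work is spread over all nodes instead of landing on one.

```python
import itertools
from typing import List


class RoundRobinNodes:
    """Hand out node addresses in a fixed rotating order (hypothetical helper)."""

    def __init__(self, nodes: List[str]) -> None:
        if not nodes:
            raise ValueError("at least one node address is required")
        self._cycle = itertools.cycle(nodes)

    def next_node(self) -> str:
        """Return the node that should receive the next request."""
        return next(self._cycle)


# Hypothetical addresses for a 3-node cluster.
pool = RoundRobinNodes([
    "http://node1:9200",
    "http://node2:9200",
    "http://node3:9200",
])
picked = [pool.next_node() for _ in range(4)]
print(picked)
```

With only one address configured and no sniffing, the list above would hold a single entry, so every request would be coordinated by the same node.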

I am hoping someone from Elastic will answer your queries. The term "coordinating node" has become something of a black box, as a lot of its behavior is still unknown.
