Any reason not to use ALL data nodes as coordinator nodes for queries/indexing?

We have 37 data nodes but our clients only use 6 of them to send requests to.

Is there any reason why the entire set of nodes couldn't be used by the clients? I know that some clusters use dedicated coordinator nodes, but I don't see a justification for that in our system given the sheer number of nodes we have since each can serve as a coordinator AND our generally low CPU usage. Is there any reason we couldn't just say that any one of the 37 nodes is a valid coordinator node to use for indexing/querying?

By using all data nodes as coordinator nodes, I see it as reducing the coordination load by about 84% on each of the 6 nodes that do it now, while adding just a marginal overhead to each of the remaining nodes (if these 6 nodes can do it and still handle their data responsibilities, then the others should be able to handle roughly 1/6 of what each current coordinator is doing). It also seems more resilient. Right now if one of the 6 goes down we've lost 16.7% of our coordination ability, vs 2.7% if one of the 37 goes down.
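As a quick sanity check on those percentages, here is a minimal sketch of the arithmetic. It assumes requests are spread evenly across whichever nodes the clients target, which is a simplification; the variable names are only illustrative.

```python
# Assumption: coordination work is spread evenly across the targeted nodes.
current_coordinators = 6
total_data_nodes = 37

# Share of all coordination work handled by a single node in each setup.
share_now = 1 / current_coordinators
share_all = 1 / total_data_nodes

# How much lighter each of the original 6 nodes becomes.
reduction = 1 - share_all / share_now

# Load each node takes on, relative to what one coordinator carries today.
relative_load = share_all / share_now

print(f"Share per node today:        {share_now:.1%}")
print(f"Share per node with all 37:  {share_all:.1%}")
print(f"Reduction on the original 6: {reduction:.1%}")
print(f"Load relative to today:      {relative_load:.3f}")
```

The per-node share is also the fraction of coordination capacity lost when a single node goes down, so the same two numbers cover the resilience comparison.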

Seems like a no brainer, which means I must be missing something.

Hi @Mike_Snare

The devil could be in the details ... :slight_smile:

But no, from the top level, I don't think you are missing anything, assuming some homogeneity.

Many customers put a load balancer in front of the data nodes to distribute the load across the node pool, just as you describe.

We often see what you describe when a user starts with "6" nodes and then grows the cluster but never revisits which nodes the clients connect to.

I do not see an obvious flaw with your thinking at this point.

If you have dedicated master nodes, you generally want to leave these out and not send requests to them. If you have a tiered architecture, e.g. hot-warm-cold, it may also make sense to direct requests only to certain nodes. If, however, the cluster is homogeneous, it is generally better to distribute the load as evenly as possible.
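To make that selection rule concrete, here is a rough sketch in plain Python of how a client-side host list could be built: skip dedicated master nodes, then rotate through the rest. The node inventory, the `coordinator_candidates` and `round_robin` helpers, and the role labels are all hypothetical and only for illustration; a real client library or a load balancer would normally handle the rotation for you.

```python
from itertools import cycle

# Hypothetical inventory; names and role lists are illustrative only.
NODES = [
    {"name": "master-1", "roles": ["master"]},
    {"name": "data-1", "roles": ["data"]},
    {"name": "data-2", "roles": ["data"]},
    {"name": "data-3", "roles": ["data", "master"]},
]


def coordinator_candidates(nodes):
    """Return names of nodes that hold data, which skips dedicated masters."""
    return [node["name"] for node in nodes if "data" in node["roles"]]


def round_robin(hosts):
    """Return an endless iterator that yields the hosts in rotation."""
    return cycle(hosts)


hosts = coordinator_candidates(NODES)
picker = round_robin(hosts)

# Each request goes to the next host in the rotation.
for _ in range(5):
    print(next(picker))
```

In a tiered cluster the filter in `coordinator_candidates` could instead check for whichever tier should receive client traffic.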

@Mike_Snare
Do you mind sharing the way you found out that only 6 data nodes are used as coordinator nodes?

The cluster is homogeneous for now, so any data node can coordinate as well as any other.

Just a matter of looking at our clients and how they specify the nodes to connect to.

I see. Do your clients know the nodes where their data resides for a search request? Do they send search requests with node IDs set in the preference field?