Master and client node role clarifications

I am quite new to Elasticsearch, and I have read the documentation and the online book. However, some things are still not clear to me, and unfortunately I haven't found the exact answers I am looking for.

Let's assume that we have the classic cluster setup of a client (coordinating) node, a master node, and a data node.

I know that the master's role is to manage the cluster state. That involves adding and removing cluster nodes and creating indexes. So, when you create an index, this request must be processed by the master. That means that if a request for index creation is made to the data node, the data node will forward the request to the master to handle it. The master node might in turn "send back" the request to the data node to create the index, if it determines that the index must be stored on the same data node where the initial request was made.

  1. Will a client node (load balancer) determine that, since this is an index creation job, the request should be forwarded straight to the current master node?

  2. If the answer to the above is yes, does this mean that a client node helps you have fewer "back-and-forth" requests within the cluster?

I set up this client-master-data node cluster locally on my laptop using three different instances of Elasticsearch. I then created an index using curl against each node. That is, I sent one index creation request to the master node, another to the client node, and one more to the data node. All three indexes were successfully created.
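A minimal Python sketch of that experiment, in case it helps anyone reproduce it. The ports, the node labels, and the index names are assumptions (they are not from my actual setup), and it assumes a local cluster without security enabled. `PUT /<index>` is the index creation endpoint.

```python
import urllib.request

# Hypothetical local addresses; adjust to the http.port of each instance.
NODES = {
    "master": "http://localhost:9200",
    "client": "http://localhost:9201",
    "data": "http://localhost:9202",
}


def build_create_index_request(base_url, index_name):
    """Build (but do not send) a PUT /<index> request, which creates an index."""
    return urllib.request.Request(f"{base_url}/{index_name}", method="PUT")


def create_index_on_each_node(nodes):
    """Send one create-index request per node, with a distinct index name each time."""
    for role, base_url in nodes.items():
        request = build_create_index_request(base_url, f"test-via-{role}")
        with urllib.request.urlopen(request, timeout=10) as response:
            print(role, response.status, response.read().decode("utf-8"))


# Requires the three nodes to be running:
# create_index_on_each_node(NODES)
```

Whichever node receives the request, the index ends up in the same cluster, which matches what I observed.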

  1. As far as I understand, if you don't send ALL your requests to the client node (coordinator), you are actually neglecting the specific cluster setup. Just having a client-master-data node setup doesn't mean it serves its purpose if you send your requests to any node you wish. For this setup to work, you MUST send all your requests to the client node. Is that correct?

  2. So the client node is responsible for sending requests to the appropriate node: requests for cluster changes and index creation go to master nodes, and searches go to data nodes. If the client node knows where each piece of data is stored (in other words, when it receives a search term, it knows which nodes or shards to ask), why must this information be stored on all cluster nodes? Why must data nodes know where everything is stored, since the whole searching job is done by the client node?

  3. The way I see it, apart from the actual data located on the data nodes, all other information is located on all types of nodes within the cluster. It's just that you isolate/assign specific jobs to specific nodes, despite the fact that these jobs could be performed by all nodes. Is that correct?

  4. Is there any information located on one node type but not on another? For example, does the master node contain information that a client node does not have (and vice versa)?

  5. Am I completely confused and don't know what I am talking about?

Thanks

The default configuration for a node is that it is master eligible and holds data. Just because you can use dedicated node types doesn't mean that you should. For small clusters, e.g. 3-5 nodes, it often makes sense to let all nodes hold data and be master eligible. You can then send any request to any node, and as all nodes have a copy of the cluster state, they know how to route the requests.
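To illustrate why a copy of the cluster state is enough for any node to route a request, here is a toy sketch. It is a simplification, not Elasticsearch's implementation: `zlib.crc32` stands in for the real hash function, and `shard_to_node` is a hypothetical slice of the cluster state.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Toy shard selection: hash the routing value, modulo the primary shard count.

    zlib.crc32 is only a stand-in; Elasticsearch uses its own hash function.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


# Hypothetical piece of cluster state: which node holds each primary shard.
shard_to_node = {0: "node-a", 1: "node-b", 2: "node-c"}


def route(document_id):
    """Any node that holds shard_to_node can work out the target node on its own."""
    return shard_to_node[pick_shard(document_id, len(shard_to_node))]


print(route("doc-1"))
```

Since every node can run the same calculation on the same state, the answer does not depend on which node received the request.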

It is generally only for larger clusters that it starts to make sense to separate out dedicated master nodes. Client nodes are not necessarily required at all for many use cases.
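As a rough sketch of how the node types differ in configuration, the snippet below prints the legacy boolean flags `node.master` and `node.data` for each type. This assumes a version that still uses these two flags; newer releases use a `node.roles` list instead, and depending on the version other role flags (such as ingest) may also need to be disabled for a pure client node. The helper names are made up for illustration.

```python
def node_settings(master_eligible, holds_data):
    """Return the two legacy role flags as they would appear in elasticsearch.yml."""
    return {"node.master": master_eligible, "node.data": holds_data}


# Node types discussed in this thread, expressed as flag combinations.
ROLES = {
    "default": node_settings(True, True),
    "dedicated master": node_settings(True, False),
    "data only": node_settings(False, True),
    "client / coordinating only": node_settings(False, False),
}


def to_yaml_lines(settings):
    """Render a settings dict as 'key: value' lines with lowercase booleans."""
    return [f"{key}: {str(value).lower()}" for key, value in settings.items()]


for role, settings in ROLES.items():
    print(f"# {role}")
    print("\n".join(to_yaml_lines(settings)))
```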

@Christian_Dahlqvist What use cases justify having a dedicated client node?

Dedicated coordinating nodes, aka client nodes, take the request parsing and the final stages of aggregating/querying away from the data nodes and allow them to concentrate on handling the data. To what extent this benefits a cluster will vary from case to case. Generally I would say their use is more common in query-heavy use cases where caching on the data nodes is important.
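A toy model of that division of labour, assuming made-up shard names and scores: each shard returns its local top hits, and the coordinating side merges them into the final result. It only sketches the scatter/gather idea and is not how Elasticsearch is implemented internally.

```python
import heapq

# Hypothetical per-shard hits as (score, document id) pairs.
shard_results = {
    "shard-0": [(0.9, "a"), (0.4, "b")],
    "shard-1": [(0.7, "c"), (0.6, "d")],
    "shard-2": [(0.8, "e")],
}


def query_shard(shard_name, size):
    """Data-node side: return this shard's local top `size` hits."""
    return heapq.nlargest(size, shard_results[shard_name])


def coordinate_search(size):
    """Coordinating side: fan out to every shard, then merge the partial results."""
    partial = []
    for shard_name in shard_results:
        partial.extend(query_shard(shard_name, size))
    return heapq.nlargest(size, partial)


print(coordinate_search(2))  # [(0.9, 'a'), (0.8, 'e')]
```

The merge step in `coordinate_search` is the part a dedicated coordinating node takes off the data nodes.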

Got it.. thank you.

Still, could someone answer the questions in my initial post?
