Node types in an Elasticsearch cluster


#1

Hello everyone,

I am reading this article, and it seems there are various types of nodes: master nodes, client nodes, aggregation query result nodes, data nodes, etc.

Is there an introduction to all the node types, along with best practices for planning them in a cluster?

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html

thanks in advance,
Lin


(Mark Walkom) #2

I think there is only that page, which could probably be expanded on :smile:

However, there are only four node types: master, data, client and tribe.

  • Master-only nodes take part in updating the cluster state as well as in master elections. They should never handle query or index loads.
  • Data-only nodes store data that is indexed into Elasticsearch. These can also handle querying and indexing.
  • Client-only nodes are used as load balancers for indexing and searching.
  • Tribe nodes are akin to cross-cluster client nodes, in that they can query more than one cluster. Think federated search.

Regarding placement, we generally recommend separating roles out for larger clusters (more than 10 nodes), but that threshold is arbitrary and very use-case dependent.


#3

Thanks Mark,

So the client node is where search results from the data nodes are aggregated and finally returned to the customer?

regards,
Lin


(Mark Walkom) #4

Yes, although a data or a master node can also do this.


#5

Thanks Mark,

Are there configuration samples for all kinds of nodes in a cluster? I currently only have a single-node cluster and want to learn by example how to configure a multi-node cluster.

regards,
Lin


(Mark Walkom) #6

Not really, it's just a combination of node.master: and node.data: being set to true or false.

If you set node.client: true it implicitly sets node.master and node.data to false.
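
As a rough illustration of the combinations described above, the three roles could be sketched as follows. This is a minimal sketch, not an official sample: each fragment is assumed to live in a different node's elasticsearch.yml, and they are shown together (separated by `---`) only for comparison. Only node.master and node.data are used, since those are the settings named in this thread.

```yaml
# Each fragment below stands for ONE node's elasticsearch.yml;
# the "---" separators are only here to show them side by side.

# Master-only node: master-eligible, holds no data.
node.master: true
node.data: false
---
# Data-only node: stores and serves data, not master-eligible.
node.master: false
node.data: true
---
# Client-only node: neither master-eligible nor a data holder.
# (Per the post above, node.client: true implies this combination.)
node.master: false
node.data: false
```

Leaving both settings at true gives a node that can take on both the master and data roles, which is the usual setup for small clusters.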


#7

Thanks Mark,

What is the benefit of having a dedicated client node rather than using a master/data node for searches?

regards,
Lin


(Mark Walkom) #8

You can potentially run a massive query against a master or data node that causes it to run out of memory (OOM), which is bad.

If you use a client node and it OOMs, then you have no worries about impacting your data or your cluster state.


#9

Makes sense, thanks Mark.

Are there any other documents on best practices for configuring a cluster with multiple node roles (e.g. the number of nodes of each type)?

regards,
Lin


(Mark Walkom) #10

No, the rest is based on your needs and use case.


#11

Thanks Mark,

How do I decide how many nodes I need for each role?

regards,
Lin


(Magnus Bäck) #12

How do I decide how many nodes I need for each role?

That's a "how long is a piece of rope" kind of question that depends on too many factors. If you don't have at least ten nodes (give or take) I don't think you should spend time on this.

Keep in mind that if you want master-only nodes you should have at least three: a single master node is a single point of failure for your cluster, and two make you liable to split-brain issues. The cost of three master-only nodes isn't justifiable for small clusters.
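
To make the three-master point concrete, here is a hedged sketch of what each of the three master-only nodes might carry in its elasticsearch.yml. It assumes an Elasticsearch release that uses zen discovery; the discovery.zen.minimum_master_nodes setting is not mentioned in this thread and is included here as an assumption about how the split-brain protection would typically be configured on such releases.

```yaml
# Assumed elasticsearch.yml for each of three dedicated master nodes.
node.master: true
node.data: false

# Quorum of master-eligible nodes: (3 / 2) + 1 = 2 (integer division).
# With this value, a lone master-eligible node that gets cut off from
# the other two cannot elect itself, which is what guards against
# split brain.
discovery.zen.minimum_master_nodes: 2
```

With only two master-eligible nodes the same formula also gives 2, so losing either node leaves the cluster without a master. That is why three is the practical minimum.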


#13

Thank you Magnus.

regards,
Lin

