Suggestions for appropriate number of nodes

Hi,

I want to set up my Elastic cluster. I have 10 machines for non-data roles, plus 2 machines reserved exclusively for data nodes.

I was thinking of using the machines in the following way:

  1. 4 machines running both Master + Ingest
  2. 4 machines running both Kibana + Coordinator
  3. 2 machines exclusively for APM Server
  4. 2 machines exclusively for data nodes (I cannot allocate more here, as I don't have other machines with enough disk space)

Is this a good setup? Can you please suggest improvements?

Sizing will depend on the details of your environment. Generally I would recommend that you experiment with real workloads (or a simulation thereof), using an approach such as described at https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

Typically you would have three master-eligible nodes for Elasticsearch, as described at https://www.elastic.co/blog/a-new-era-for-cluster-coordination-in-elasticsearch:

Typically we recommend that clusters have three master-eligible nodes so that if one of the nodes fails then the other two can still safely form a quorum and make progress. If a cluster has fewer than three master-eligible nodes, then it cannot safely tolerate the loss of any of them. Conversely if a cluster has many more than three master-eligible nodes, then elections and cluster state updates can take longer.
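To make the quorum arithmetic in that passage concrete, here is a minimal Python sketch. It assumes a plain majority rule (more than half of the master-eligible nodes must be available); the function names are made up for illustration and are not part of any Elasticsearch API.

```python
def quorum_size(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can fail while a majority remains."""
    return master_eligible - quorum_size(master_eligible)


# Compare a few cluster sizes under the simple majority assumption.
for n in (1, 2, 3, 4, 5):
    print(f"{n} master-eligible: quorum={quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this arithmetic, four master-eligible nodes tolerate one failure, the same as three, so the fourth adds coordination overhead without adding fault tolerance.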

The rest depends on:

  • how capable the hardware is
  • how much data you're sending to APM Server and Elasticsearch (how many events/sec)
  • how many end-users will be accessing Kibana
  • desired fault tolerance

Side note: if you only have 2 data nodes then that doesn't leave a lot of room for failure. If you haven't already, consider also using snapshot and restore, for backing up data to slower network/cloud storage.


We will consider these points. I guess what I was really asking is which services I can co-host on one machine (let's say each machine is powerful enough to host 2 services given our load).

Excluding data nodes, I was thinking of co-hosting

Master + Ingestor
APM + Kibana
Coordinator on separate machines

Will it cause any problems or do you see any red flags?

OK, I see.

If you have the hardware available, then I would say it's generally a good idea to have dedicated master-eligible nodes. Under https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#dedicated-master-node there's some discussion of why it may be a good idea to run dedicated master-eligible nodes.

For Kibana, the simplest way to load-balance over multiple Elasticsearch nodes is by running a co-located coordinating-only Elasticsearch node, as described at https://www.elastic.co/guide/en/kibana/current/production.html#load-balancing-es

APM Server and Ingest node are both CPU-heavy. The more CPU you can allocate to them the better, so it may be better to run them on dedicated machines.

So (without knowing the finer details of your environment) I'd probably go with something more like:

  • 3 master-eligible nodes
  • 2-3 dedicated Ingest nodes
  • 2-3 Kibana servers with local coordinating-only Elasticsearch nodes for load-balancing
  • 2 dedicated APM Servers

If you intend to run ML or Transforms, you might also run those on the Ingest nodes.
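As a rough illustration of that layout, the sketch below prints the `node.roles` line each group of Elasticsearch nodes might carry in `elasticsearch.yml`. It assumes Elasticsearch 7.9 or later, where the `node.roles` setting is available (older versions use the `node.master`, `node.data` and `node.ingest` booleans instead). The `LAYOUT` table, the machine counts picked within the ranges above, and the helper name are all hypothetical. APM Server and Kibana are not Elasticsearch nodes, so the APM machines do not appear here.

```python
# Hypothetical layout: group name -> (machine count, Elasticsearch roles).
# 3 ingest and 2 Kibana machines is just one choice within the ranges
# suggested above, not a recommendation.
LAYOUT = {
    "master-eligible": (3, ["master"]),
    "ingest": (3, ["ingest"]),
    "kibana + coordinating-only": (2, []),  # empty list = coordinating-only
    "data": (2, ["data"]),
}


def render_node_roles(roles):
    """Format a role list as a node.roles line for elasticsearch.yml."""
    inner = ", ".join(roles)
    return f"node.roles: [ {inner} ]" if inner else "node.roles: [ ]"


for group, (count, roles) in LAYOUT.items():
    print(f"{count} x {group}: {render_node_roles(roles)}")
```

With these counts the Elasticsearch nodes occupy 10 machines, which together with the 2 dedicated APM Servers accounts for all 12 machines mentioned in the question.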


This helps, thank you.

One clarification: everywhere it is suggested to allocate 50% of system memory to the JVM heap for Elasticsearch. Is there a specific reason for the 50% figure? Would there be any issue if we went to, let's say, 70% of memory?

It is important to leave enough memory for things other than the JVM heap, as described at https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html#heap-size

I don't know where the 50% number comes from. I recommend opening a new topic at https://discuss.elastic.co/c/elastic-stack/81 if you would like more details on running Elasticsearch.
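To show the trade-off in numbers, here is a small Python sketch. It assumes two things that are worth checking against the linked heap-size page for your version: that the heap should be capped at a fraction of physical RAM, and that it should stay below the JVM's compressed object pointer threshold (somewhere near 32 GB; the 26 GB default below is a conservative assumption, not a measured value). The function name and parameters are hypothetical.

```python
def heap_plan(ram_gb, heap_fraction=0.5, oops_cap_gb=26.0):
    """Return (heap_gb, remaining_gb) for a machine with ram_gb of RAM.

    heap_fraction: share of RAM given to the JVM heap (assumed 0.5 default).
    oops_cap_gb: assumed safe ceiling below the compressed-oops threshold.
    """
    heap_gb = min(ram_gb * heap_fraction, oops_cap_gb)
    # Whatever is left serves the OS, the filesystem cache and
    # off-heap memory used by Elasticsearch and Lucene.
    return heap_gb, ram_gb - heap_gb


for fraction in (0.5, 0.7):
    heap, rest = heap_plan(16, heap_fraction=fraction)
    print(f"{int(fraction * 100)}% of 16 GB: heap={heap:.1f} GB, "
          f"left for everything else={rest:.1f} GB")
```

On a 16 GB machine, moving from 50% to 70% shrinks the memory left outside the heap from 8 GB to under 5 GB, which is the memory the filesystem cache would otherwise use.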

