How many CPUs should be allocated to an Elasticsearch node that doesn't store data?

I raised this question on Stack Overflow. Would you be able to help answer it?

The Elasticsearch documentation recommends using multi-core processors for Elasticsearch nodes but doesn't say how many CPU cores are needed for peak indexing and search performance. Based on the answer to the [question on maxing out CPUs in Elasticsearch](https://stackoverflow.com/questions/33611302/how-to-max-out-cpu-cores-on-elasticsearch-cluster), each shard of an index uses one CPU thread during a search.

If I were to deploy Elasticsearch with the following architecture:

  • 3 master-only nodes
    node.master: true
    node.data: false
  • 3 search load-balancer nodes
    node.master: false
    node.data: false
  • 3 or more data nodes, with number_of_replicas = 2 and number_of_shards = 8 for each index hosted on these nodes
    node.master: false
    node.data: true

then,

  1. Do the search load-balancer nodes need 1 CPU per shard, like the data nodes, to process searches on the indexes, or can they process searches efficiently with a lower CPU count of 2 or 4?
  2. Do the master nodes need 1 CPU per shard to process the ingest traffic, or would this be an inefficient design?
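
For reference, the shard layout implied by this architecture can be worked out with a little arithmetic. This is only a minimal sketch: it assumes shard copies are spread evenly across the data nodes, and the function name is hypothetical, not part of any Elasticsearch API.

```python
import math


def shard_copies_per_data_node(primaries: int, replicas: int, data_nodes: int) -> int:
    """Shard copies of one index that land on each data node.

    Assumes an even spread across data nodes (an assumption, not a
    guarantee) and rounds up when the copies do not divide evenly.
    """
    # Each primary shard has `replicas` additional copies.
    total_copies = primaries * (1 + replicas)
    return math.ceil(total_copies / data_nodes)


# Values from the architecture above: 8 primaries, 2 replicas, 3 data nodes.
print(shard_copies_per_data_node(primaries=8, replicas=2, data_nodes=3))
```

With 3 data nodes and 2 replicas, every data node ends up holding one copy of each of the 8 shards, which is why the "1 CPU per shard" question matters for the data nodes in particular.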

The documentation doesn't recommend a specific number of cores because that depends on a number of variables.

  1. It'll work fine with 2-4 CPUs.
  2. If you have master only nodes, don't use them for anything else. Use your coordinating nodes instead.

> If you have master only nodes, don't use them for anything else. Use your coordinating nodes instead.

  • Does this imply that giving each master node 2 CPUs should be sufficient?
  • I assume the coordinating nodes are the nodes I referred to as "search load-balancers", or are they data nodes?

I am trying to keep the cost of the cluster low and would like to add as many CPUs as I can afford to the data nodes (with around 32 GB of memory allocated) while keeping the search load-balancer and master-only nodes lightweight. Are there any disadvantages to this approach?

Cluster state operations are single threaded to ensure consistency, so 2 CPUs, one for that and one for the OS, is a good fit.

Yes, the "official" name for that is coordinating nodes :wink:

Nope, totally sane idea.

Awesome! Thanks! :bowing_man:

Mind if I link this thread as the solution to the Stack Overflow question?

Go for it :slight_smile:
