The Elasticsearch documentation recommends using multi-core processors for Elasticsearch nodes but doesn't recommend a number of CPU cores for peak indexing and search performance. Based on the answer to the [question on maxing out CPUs in Elasticsearch](https://stackoverflow.com/questions/33611302/how-to-max-out-cpu-cores-on-elasticsearch-cluster), each shard of an index uses one CPU thread during a search.
If I were to deploy Elasticsearch with the following architecture:
- 3 master-only nodes (`node.master: true`, `node.data: false`)
- 3 or more data nodes (`node.master: false`, `node.data: true`), with `number_of_replicas = 2` and `number_of_shards = 8` for each index hosted on these nodes
then,
Do the search load-balancer nodes need 1 CPU per shard, like the data nodes, to process searches on the indexes, or can they process searches efficiently with a lower CPU count of 2 or 4?
Do the master nodes need 1 CPU per shard to process the ingest traffic, or would this be an inefficient design?
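To put numbers on the layout described above, here is a minimal Python sketch of the shard arithmetic. It assumes exactly 3 data nodes, evenly balanced shards, and the one-search-thread-per-shard behaviour from the linked answer. The function name `shard_layout` is hypothetical and only for illustration.

```python
import math


def shard_layout(primaries: int, replicas: int, data_nodes: int) -> dict:
    """Rough shard arithmetic for a single index (illustrative only)."""
    # Every primary shard has `replicas` additional copies.
    total_copies = primaries * (1 + replicas)
    # Assumption: the cluster balances shard copies evenly across data nodes.
    per_node = math.ceil(total_copies / data_nodes)
    return {
        "total_shard_copies": total_copies,
        "shard_copies_per_node": per_node,
        # A search needs one copy (primary or replica) of each shard.
        "shards_queried_per_search": primaries,
    }


layout = shard_layout(primaries=8, replicas=2, data_nodes=3)
print(layout)
```

With 8 primaries and 2 replicas on 3 data nodes, each data node ends up holding a copy of every shard, so the per-shard search work lands on the data nodes rather than on the nodes that only route requests.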
If you have master only nodes, don't use them for anything else. Use your coordinating nodes instead.
Does this imply that provisioning each master node with 2 CPUs should be sufficient?
I assume the coordinating nodes are the nodes I referred to as "search load-balancers", or are they data nodes?
I am trying to keep the cost of the cluster low and would like to add as many CPUs as I can afford on the data nodes (with around 32 GB of memory allocated) while keeping the search load-balancer and master-only nodes lightweight. Are there any disadvantages to this approach?
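As a rough guide to what extra CPUs on a data node buy, the sketch below computes the default size of the `search` thread pool, which the Elasticsearch thread pool documentation gives as `int((allocated processors * 3) / 2) + 1`. The helper name `default_search_pool_size` is hypothetical, and the formula should be checked against the documentation for the version actually deployed.

```python
def default_search_pool_size(allocated_processors: int) -> int:
    """Default `search` thread pool size per node, per the documented formula.

    Assumption: the default has not been overridden in elasticsearch.yml.
    """
    return (allocated_processors * 3) // 2 + 1


# Compare candidate CPU counts for a data node.
for cpus in (2, 4, 8, 16):
    print(f"{cpus} CPUs -> {default_search_pool_size(cpus)} search threads")
```

Under this default, an 8-CPU data node gets 13 search threads, which is enough to work on all 8 shards of one index concurrently if the one-thread-per-shard assumption holds.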