I have four servers, each with 64 GB of RAM, 24 cores, and 4 TB of SSD storage. I have been reviewing videos about the best way to deploy Elasticsearch in a cluster environment.
Should I use three of the servers for the cluster and one server for Logstash and Kibana?
And within the cluster, should one server be the master node and the other two be data nodes and client nodes?
Logstash and Kibana generally have a smaller footprint than Elasticsearch. This all really depends on the volume and type of data we are talking about, but Logstash would likely only need a 2gb JVM heap (allocate ~4gb of total system memory) and somewhere between 4 and 8 cores. 2gb should be sufficient for Kibana, maybe 2-4 cores.
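For what it's worth, a minimal sketch of what that Logstash heap setting might look like (the 2gb figure is just the starting point suggested above, not a hard rule):

```
# config/jvm.options for Logstash -- assuming the ~2gb heap suggested above
-Xms2g
-Xmx2g
```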
Given that, you could provision three of the servers for Elasticsearch with master & data roles. On these servers you could allocate up to ~31.5gb for the JVM heap, but please read the "A Heap of Trouble" blog post to better understand how much to allocate. Don't worry about the remaining/unused memory; Elasticsearch will use all of it for file system caching. Also, make sure to review the important settings docs.
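As a rough sketch only, each of those three hosts could look something like the config below. This assumes a 6.x/7.x-style configuration (newer releases use node.roles instead of the boolean flags), and the cluster/node names are made up for illustration:

```yaml
# elasticsearch.yml on each of the three master/data hosts (illustrative values)
cluster.name: my-cluster          # hypothetical cluster name
node.name: es-node-1              # unique per host
node.master: true                 # master-eligible
node.data: true                   # holds data
# on 6.x you would also set discovery.zen.minimum_master_nodes: 2 -- see the important settings docs

# config/jvm.options on the same hosts -- keep the heap at or below ~31gb
# -Xms31g
# -Xmx31g
```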
Use the remaining host for Logstash and Kibana. If you aren't using something like Docker to constrain resources, you'll want to limit the number of pipeline workers for Logstash (by default it will be the number of CPUs). Since you will have plenty of headroom, you can also set up a coordinating-only node for Elasticsearch on this host, and Logstash and Kibana can use it (e.g. localhost:9200). This coordinating node would only need 8-12 cores and a 12-16gb JVM heap (that's on the high side, but it is better to overestimate). Elasticsearch will size its thread pools based on the number of CPU cores it can see, but you can adjust that with the processors setting [not necessary if you use Docker to partition resources].
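A minimal sketch of the fourth host, again assuming 6.x/7.x-style settings; the worker count and the exact values are assumptions based on the rough numbers above, not recommendations:

```yaml
# elasticsearch.yml -- coordinating-only node on the Logstash/Kibana host
node.master: false
node.data: false
node.ingest: false
# processors: 8                   # optional cap on thread pool sizing (node.processors on newer versions)

# logstash.yml -- keep Logstash from claiming all 24 cores
pipeline.workers: 8

# kibana.yml -- point Kibana at the local coordinating node
elasticsearch.url: "http://localhost:9200"   # elasticsearch.hosts on newer Kibana versions
```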