I'm new to this forum, and rather new to Elasticsearch. I've been playing around a bit with the ELK stack on virtual machines. Now, however, I would like to know what it takes, hardware-wise, to run it in a production environment.
I receive around 300 MB of log files a day, with peaks of 1 GB a day, from around 250 servers. Any suggestions about memory, number of cores, number of nodes, and maybe other tips? And why should I use that amount of memory, cores, etc.?
The amount of data you have is small, with decent peaks. What I would recommend is a clustered setup with 3 nodes for ES, 2 for Logstash and 2 for the web tier. The idea behind the web tier is to protect your ES behind a reverse proxy, and the idea behind the 3-node ES cluster is to have replicas of your data (a master-slave style setup). Hardware-wise, use the following:
2 CPU, 8 GB RAM, 500 GB disk ---> for ES
1 CPU, 2 GB RAM, 100 GB disk ---> for web tier
2 CPU, 4 GB RAM, 100 GB disk ---> for Logstash
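To sanity-check the disk figures, here is a rough back-of-the-envelope sketch in Python. The volumes (0.3 GB on a normal day, 1 GB on a peak day), the 3 nodes and the 500 GB per node come from this thread. Everything else is an assumption made for illustration: one replica per shard, an indexing overhead of 1.2x on top of the raw log size, and keeping disks below 85% full. Measure your own indices before relying on these numbers.

```python
def retention_days(daily_gb, node_count, disk_gb_per_node,
                   replicas=1, index_overhead=1.2, usable_fraction=0.85):
    """Estimate how many days of logs fit on the cluster.

    replicas, index_overhead and usable_fraction are illustrative
    assumptions, not measured values.
    """
    # Disk space we are willing to fill across all data nodes.
    usable_gb = node_count * disk_gb_per_node * usable_fraction
    # Each day's data is stored once as primary plus once per replica.
    stored_per_day_gb = daily_gb * index_overhead * (1 + replicas)
    return usable_gb / stored_per_day_gb


if __name__ == "__main__":
    for label, daily_gb in (("normal day", 0.3), ("peak day", 1.0)):
        days = retention_days(daily_gb, node_count=3, disk_gb_per_node=500)
        print(f"{label}: roughly {days:.0f} days of retention")
```

Even if every day were a peak day, this estimate leaves well over a year of retention, which is why the disk sizes could be reduced if you only need a few months of logs.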
I hope I answered your question. Let us know if you want more info on it.
First of all, thanks a lot for your help. I do have another question, though: why do you think I should use 8 GB instead of 16, 32 or, as Elasticsearch recommends, 64 GB of memory?
@Arvid_de_Jong The reason I recommended 8 GB of memory is the amount of data you receive. Obviously there is no harm in using additional memory, but I recommend 8 GB to keep it cost-effective. If you are in a cloud environment like AWS, you can increase the memory and CPUs on the fly. You can always start with less memory, see what latency you get, and increase it afterwards. Hope I was clear.
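As a rough illustration of how RAM translates into Elasticsearch heap, the sketch below applies the commonly cited rule of thumb: give the JVM heap about half of the machine's RAM (the rest is left to the operating system's file cache) and keep the heap below roughly 32 GB so the JVM can keep using compressed object pointers. The function name and the 31 GB cap are assumptions chosen for this example, not an official formula.

```python
def suggested_heap_gb(ram_gb, max_heap_gb=31):
    """Rule-of-thumb heap size: half of RAM, capped below ~32 GB.

    max_heap_gb=31 is an assumed safe cap for this sketch.
    """
    return min(ram_gb / 2, max_heap_gb)


if __name__ == "__main__":
    for ram_gb in (8, 16, 32, 64):
        heap_gb = suggested_heap_gb(ram_gb)
        print(f"{ram_gb} GB RAM -> about {heap_gb:g} GB heap")
```

Under this rule an 8 GB node gets about 4 GB of heap, and a 64 GB node is roughly the point beyond which extra RAM no longer goes to the heap. For a few hundred MB of logs a day, the smaller heap is a reasonable starting point that can be raised if you see latency or memory pressure.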
The reason I suggested 100 GB for the web tier is the massive volume of logs generated by apache/nginx, plus the case where you want to keep previous logs around, considering it is a production server. It can definitely be downsized, and I agree with that. I sized it this way so that disk space would not become a concern.
@luuk I recommend more or less the same, but we could downsize Logstash to one node and remove a node from ES too. In any case, it depends on what infrastructure you have and what you are trying to achieve with the ELK stack.