I have been parsing logs with Logstash's grok parser, log by log, and recently one of my pipelines.yml files, which covers 100+ input files, has been crashing my Elasticsearch server (out of memory).
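For context, the setup looks roughly like this (a hypothetical sketch; the pipeline IDs and paths below are made up, but the shape matches what I'm running):

```yaml
# pipelines.yml (sketch) - each pipeline points at its own config file,
# which reads one log source with a file input and a grok filter.
# The IDs and paths here are illustrative only.
- pipeline.id: app-access-logs
  path.config: "/etc/logstash/conf.d/app-access.conf"
- pipeline.id: app-error-logs
  path.config: "/etc/logstash/conf.d/app-error.conf"
# ... 100+ more entries like these
```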
After reading this, my guess is that Elasticsearch was previously crashing because something wasn't getting cleared out of a cache while Logstash was parsing (~25GB worth of logs), and my system was running out of memory. I have not been running any Kibana queries at all, only parsing with Logstash. Could this possibly be what was making Elasticsearch crash?
If not, my only other guess would be that I simply allocated too little memory to Elasticsearch, which seems like a more expensive problem!
At the time of my testing (before I started reading up on memory allocation in Elasticsearch), I was using the default 1GB heap and only 4 indices. The number of documents indexed before Elasticsearch crashed was about 300 million (~40GB).
What I did after that was to (concrete sketch below):
- set indices.fielddata.cache.size: 20%
- increase the heap size to 8GB
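In concrete terms, the fielddata cap went into elasticsearch.yml like this (a sketch; exact file locations depend on how Elasticsearch was installed, and the heap change itself lives in jvm.options rather than in this file):

```yaml
# elasticsearch.yml (sketch) - cap the fielddata cache at 20% of heap so old
# field data gets evicted instead of growing until the node runs out of memory.
indices.fielddata.cache.size: 20%

# Note: the heap itself is raised separately in config/jvm.options:
#   -Xms8g
#   -Xmx8g
```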
I'm sure one or both of these will keep my server alive for longer, but I just wanted to get to the bottom of what exactly was causing the crash.
P.S. Eventually I'll probably have about 1TB worth of data in Elasticsearch and somewhere between 8GB and 32GB of RAM for it.