I have two Logstash nodes running with 8 cores each. However, they consistently crash (every 1 or 2 days) with "Too many files open". I've only been able to find cases of this message on ES; has anyone encountered it on Logstash?
-bash-4.2# ulimit -n
8192
I worry that this has to do with the performance of my grok filters. Any help or input on how to investigate or remedy this would be appreciated.
Maybe you need more Logstash resources (CPU, open file descriptors, memory, NIC bandwidth, ...), e.g. from more nodes running your filtering.
Try to track and verify that your Logstash process really uses that many file descriptors (though it most likely does, as the error shows). Then either raise the ulimit for that process/user/system (see this link), or add more resources, i.e. more processes: if you have spare CPU and memory, run additional processes on the same node listening on different input ports, or add a message queue in front of your Logstash nodes.
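To check how many descriptors the Logstash process actually holds and which limit applies to it, something along these lines should work (a sketch; the pgrep pattern and the 65536 value are assumptions to adapt to your setup):

# Count the descriptors currently held by the Logstash process
LS_PID=$(pgrep -f logstash | head -n 1)
ls /proc/$LS_PID/fd | wc -l

# Check the limit that actually applies to that running process
grep 'open files' /proc/$LS_PID/limits

# Raise the limit for the logstash user (takes effect on the next restart),
# e.g. in /etc/security/limits.conf:
#   logstash  soft  nofile  65536
#   logstash  hard  nofile  65536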
I have run Logstash with this in /etc/sysconfig/logstash:
LS_OPEN_FILES=65535
However, Logstash eventually crashed (after 3-4 days) with the same "Too many files open" message in logstash.err. So it seems that increasing the open files limit only prolongs the inevitable.
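If the count keeps climbing no matter how high you raise the limit, that points to a descriptor leak (or runaway connection growth) rather than normal load. A crude way to confirm it is to log the descriptor count over time and see whether it plateaus or grows without bound (a sketch; the pgrep pattern, interval and log path are assumptions):

LS_PID=$(pgrep -f logstash | head -n 1)
while true; do
  # Append a timestamped descriptor count every minute
  echo "$(date +%FT%T) $(ls /proc/$LS_PID/fd | wc -l)" >> /tmp/logstash_fd_count.log
  sleep 60
done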
Everything is on the latest versions (Logstash 2.1.0, ES 2, etc.). I noticed Logstash was crashing after a while even after changing my ulimit from 1024 to 64,000. I was thinking about changing it to unlimited, but this looks to be a bug, so I'm glad I didn't.
When I did an lsof, I saw tens of thousands of these:
It allows Logstash to "sniff" the cluster and discover other ES nodes that it can potentially use to forward events to. By disabling it, you are telling Logstash to only use the ES nodes specified in the hosts field. It's probably not all that useful unless your ES cluster is very active in terms of horizontal scaling (i.e. you add and remove ES nodes a lot).
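For reference, turning it off in the elasticsearch output looks roughly like this (a sketch; the host names are placeholders for your own nodes):

output {
  elasticsearch {
    hosts => ["es-node1:9200", "es-node2:9200"]   # only these nodes will be used
    sniffing => false                             # do not discover other cluster nodes
  }
}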