Logstash crashing with "Too many files open"

vtst2412 · December 6, 2015, 7:41pm

I have two logstash nodes running with 8 cores each. However, they consistently crash (every 1 or 2 days) with "Too many files open". I've only been able to find reports of this message for ES. Has anyone encountered this on logstash?

-bash-4.2# ulimit -n
8192

https://gist.github.com/vttran/4e5725954c4fb467993c

logstash.err

sun/misc/URLClassPath.java:1003:in `getResource': java.lang.InternalError: java.io.FileNotFoundException: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-2.b17.el7_1.x86_64/jre/lib/ext/cldrdata.jar (Too many open files)
        from sun/misc/URLClassPath.java:212:in `getResource'
        from java/net/URLClassLoader.java:365:in `run'
        from java/net/URLClassLoader.java:362:in `run'
        from java/security/AccessController.java:-2:in `doPrivileged'
        from java/net/URLClassLoader.java:361:in `findClass'
        from java/lang/ClassLoader.java:424:in `loadClass'
        from java/lang/ClassLoader.java:411:in `loadClass'
        from sun/misc/Launcher.java:331:in `loadClass'
        from java/lang/ClassLoader.java:357:in `loadClass'

This file has been truncated.

https://gist.github.com/vttran/1f44be6c34cbaed62670

gistfile1.txt

###############################
# Default settings for logstash
###############################

# Override Java location
#JAVACMD=/usr/bin/java

# Set a home directory
#LS_HOME=/var/lib/logstash

This file has been truncated.

https://gist.github.com/vttran/4248661c8a6ba3942741

mini.conf

input {
  beats {
    # The port to listen on
    port => 5055
    
    # ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    # ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"

  }
}

This file has been truncated.

I worry that this has to do with the performance of my grok filters. Any help or input on how to investigate or remedy this would be appreciated.

stefws · December 6, 2015, 8:28pm

Try to track the number of open FDs:

ls /proc/<pid>/fd/ | wc -l

Maybe you need more logstash resources (CPU, open file descriptors, memory, NIC bandwidth...), for example from more nodes running your filtering.

Try to track and verify that your logstash really uses that many fds (it most likely does, as the error shows :). If so, try to raise the ulimit for this process/user/system (see this link), or add more resources, i.e. more processes. If you have CPU and memory to spare, you could run more processes on the same node listening on different input ports, or add an MQ in front of your logstash nodes...
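To make the tracking suggestion concrete, here is a minimal sketch for Linux that compares the current descriptor count of a process with its soft limit. The function names are made up for illustration, and the `pgrep` pattern in the usage comment is an assumption about how logstash appears in the process list on your machine.

```shell
# Count the file descriptors a process currently holds (Linux /proc only).
count_fds() {
  ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Read the soft "Max open files" limit that applies to the running process.
# The line looks like: "Max open files  <soft>  <hard>  files".
soft_nofile_limit() {
  awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

# Print one timestamped sample: "<time> pid=<pid> fds=<n> limit=<m>".
report_fd_usage() {
  printf '%s pid=%s fds=%s limit=%s\n' \
    "$(date +%T)" "$1" "$(count_fds "$1")" "$(soft_nofile_limit "$1")"
}

# Example (hypothetical PID lookup; adjust the pattern to your install):
#   pid=$(pgrep -f logstash | head -n 1)
#   while sleep 60; do report_fd_usage "$pid"; done
```

Reading `/proc/<pid>/limits` rather than running `ulimit -n` in a login shell matters here, because the limit of your shell is not necessarily the limit the logstash service was started with.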

vtst2412 · December 6, 2015, 8:34pm

I have run logstash with this in /etc/sysconfig/logstash:

LS_OPEN_FILES=65535

However, logstash eventually crashed (after 3-4 days) with the same "Too many files open" message in logstash.err. So it seems that increasing the open files limit only postpones the inevitable.
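A rough way to see why a higher limit only buys time: if the descriptor count keeps growing, two samples are enough to estimate when the limit will be hit. The sketch below assumes roughly linear growth, which may not hold in practice, and `hours_until_limit` is a hypothetical helper name.

```shell
# Estimate hours until a steadily leaking process hits its descriptor limit.
# Args: fds at first sample, fds at second sample,
#       seconds between the samples, open files limit.
hours_until_limit() {
  awk -v a="$1" -v b="$2" -v dt="$3" -v lim="$4" 'BEGIN {
    rate = (b - a) / dt              # descriptors gained per second
    if (rate <= 0) { print "no growth"; exit }
    printf "%.1f\n", (lim - b) / rate / 3600
  }'
}

# Illustrative numbers only (not taken from this thread):
#   hours_until_limit 1000 1600 600 8192     # prints 1.8
#   hours_until_limit 1000 1600 600 65535    # prints 17.8
```

With a leak, the estimate grows with the limit but never becomes "no growth", so the real fix has to be on the side of whatever is holding the descriptors open.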

Matthew_Prinvale · December 10, 2015, 6:43pm

I'm having the same issues here.

Everything is the latest (Logstash 2.1.0, ES 2, etc). I noticed Logstash was crashing after a while even after changing my ulimit from 1024 to 64,000. I was thinking about changing it to unlimited, but this looks like a bug, so I'm glad I didn't.

When I did an lsof I saw tens of thousands of these:

<beats     3311 3575   logstash  196u     IPv6            6016017      0t0        TCP redacted:33002->redacted:9200 (ESTABLISHED)
LogStash:  3311 3572   logstash   42u     IPv6            5071865      0t0        TCP redacted:32880->redacted:9200 (ESTABLISHED)

It seems like Logstash isn't closing them. Is there a workaround until a fix is in place?

vtst2412 · December 10, 2015, 6:45pm

I fixed it by turning off sniffing (in the elasticsearch output). logstash was keeping the TCP sockets open and consuming fds rapidly.

Matthew_Prinvale · December 10, 2015, 6:48pm

Nice! I just checked and mine is most certainly enabled (true). What are the implications of changing this boolean?

vtst2412 · December 10, 2015, 7:12pm

It allows logstash to "sniff" the cluster and discover other ES nodes that it can potentially use to forward events to. By disabling it, you are telling logstash to only use the ES nodes specified in the hosts field. It's probably not all that useful unless your ES cluster is super active in terms of horizontal scaling (i.e. you add and remove ES nodes a lot).
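If you are not sure whether any of your pipeline files still enable sniffing, a quick scan helps. This is a sketch under assumptions: `find_sniffing_enabled` is a made-up helper, `/etc/logstash/conf.d` is the usual package-install location but may differ on your system, and the host in the commented config is a placeholder.

```shell
# List config lines that enable sniffing, skipping commented-out lines.
# Prints "file:line:content" for each hit; prints nothing if there are none.
find_sniffing_enabled() {
  grep -rnE '^[^#]*sniffing[[:space:]]*=>[[:space:]]*true' "$1"
}

# Example:
#   find_sniffing_enabled /etc/logstash/conf.d
#
# The workaround from this thread, for each elasticsearch output found:
#   output {
#     elasticsearch {
#       hosts => ["es.example.internal:9200"]   # placeholder host
#       sniffing => false
#     }
#   }
```

Logstash needs a restart after the change for the new output settings to take effect.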

Matthew_Prinvale · December 10, 2015, 7:24pm

Great! I'm pointing to a load balancer anyways! Glad I'm working this out in dev! haha. I made the change so fingers crossed

vtst2412 · December 10, 2015, 7:43pm

No need to cross fingers. You can confirm as soon as logstash is restarted with sniffing disabled.

lsof | grep "logstash.*TCP" | wc -l

Run it a few seconds apart (about 5s would do). If the number is not growing rapidly, you are golden.
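The "run it a few seconds apart" check can be scripted so you get a single growth number instead of eyeballing it. This is a minimal sketch; `socket_growth` and `count_logstash_tcp` are hypothetical helper names, and the counting command is simply the one-liner above.

```shell
# Count open TCP sockets held by logstash, as in the one-liner above.
count_logstash_tcp() {
  lsof 2>/dev/null | grep -c 'logstash.*TCP' || true
}

# Take N samples of a counting command, S seconds apart, and print the
# difference between the last and the first sample.
# Args: number of samples, seconds between samples, [counting command].
socket_growth() {
  local samples="$1" interval="$2" counter="${3:-count_logstash_tcp}"
  local first last i
  first=$("$counter")
  last=$first
  i=1
  while [ "$i" -lt "$samples" ]; do
    sleep "$interval"
    last=$("$counter")
    i=$((i + 1))
  done
  echo $((last - first))
}

# Example: 5 samples, 5 seconds apart.
#   socket_growth 5 5
```

A result near zero suggests the socket count is stable. A steadily positive number after the restart means connections are still piling up.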
