Logstash builds up Recv-Q during minute-long pauses

(Gary) #1


We are sending syslog data over the wire, using stunnel to encrypt it. When it reaches the Logstash server (on Ubuntu 14.04), a large Recv-Q periodically builds up as the data moves out of the server-side stunnel and into Logstash's TCP port 5000.
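For anyone wanting to put numbers on the Recv-Q buildup over time, here is a minimal Python sketch that reads the receive queue for established connections on a given local port from /proc/net/tcp. It assumes Linux and IPv4 (IPv6 sockets live in /proc/net/tcp6); the function names are made up for illustration, and port 5000 is taken from the setup described above.

```python
import time


def parse_proc_net_tcp(text, port):
    """Return (remote_address_hex, rx_queue_bytes) for ESTABLISHED sockets
    whose local port equals `port`. `text` is the content of /proc/net/tcp."""
    rows = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue
        local, remote, state, queues = fields[1], fields[2], fields[3], fields[4]
        if state != "01":  # 01 = ESTABLISHED; skips the LISTEN socket
            continue
        local_port = int(local.rsplit(":", 1)[1], 16)  # port is hex
        if local_port != port:
            continue
        rx_queue = int(queues.split(":")[1], 16)  # field is tx_queue:rx_queue
        rows.append((remote, rx_queue))
    return rows


def total_recv_q(port=5000, path="/proc/net/tcp"):
    """Sum the unread bytes queued on all established sockets for `port`."""
    with open(path) as f:
        return sum(rx for _, rx in parse_proc_net_tcp(f.read(), port))


def watch(port=5000, interval=1.0, samples=60):
    """Print one timestamped Recv-Q total per interval."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), total_recv_q(port))
        time.sleep(interval)
```

Calling `watch()` on the Logstash host during a stall gives a timestamped series that can be lined up against the jstat output and the Logstash log.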

During this time, the JVM is not logging any garbage collection. In fact, "jstat -gccause" shows numbers remaining static through the duration of this "lockup". Once the Recv-Q drains into logstash and things get moving, we see normal YGC activity which is frequent but keeps things moving along just fine for us.

I wish it were pausing for GC, but I can't find any evidence that it is; it seems to be blocking on the TCP socket coming into Logstash.

We appear to be running Logstash v1.5.4 on Oracle Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode):

/usr/bin/java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -d64 -Dfile.encoding=utf-8 -Dsun.jnu.encoding=utf-8 -XX:PermSize=128m -XX:MaxPermSize=128m -javaagent:/usr/lib/jvm/java-7-oracle/jre/lib/jolokia-jvm-1.2.3-agent.jar=host=localhost,port=8779,policyLocation=file:///usr/lib/jvm/java-7-oracle/jre/lib/jolokia-access.xml -XX:+UseCompressedOops -XX:+AlwaysPreTouch -XX:+ParallelRefProcEnabled -Djava.io.tmpdir=/tmp/logstash -Djava.security.properties=/etc/logstash/java.security -Xmx4096m -Xss2048k -Djffi.boot.library.path=/opt/logstash/vendor/jruby/lib/jni -Xbootclasspath/a:/opt/logstash/vendor/jruby/lib/jruby.jar -classpath : -Djruby.home=/opt/logstash/vendor/jruby -Djruby.lib=/opt/logstash/vendor/jruby/lib -Djruby.script=jruby -Djruby.shell=/bin/sh org.jruby.Main --1.9 /opt/logstash/lib/bootstrap/environment.rb logstash/runner.rb agent -f /etc/logstash/conf.d -l /srv/log/logstash/logstash.log

Any pointers?

(Mark Walkom) #2

What's your config look like?

(Gary) #3

Here is the input portion:

input {
  tcp {
    mode => "server"
    host => ""
    port => 5000
    codec => "json"
  }
}

After that it goes through a few filters, then it's off to Graylog:

output {
  gelf {
    host => ""
    port => 12201
  }
}

(Mark Walkom) #4

Are you sure it's not Graylog? That is, have you changed the output to a file or stdout and watched the flow?

(Gary) #5

The GELF output is UDP. I sniff and see that when Graylog stops receiving messages, nothing is on the line. Blasting UDP shouldn't block, should it?
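As a rough sanity check of that intuition, here is a small Python sketch (not Logstash's GELF code) that sends UDP datagrams to a local socket that never reads. It assumes Linux loopback, where datagrams are dropped silently once the receiver's buffer is full; the function name and sizes are arbitrary. A blocking UDP socket can still stall briefly if the local send buffer fills on a slow interface, so treat this as an illustration rather than proof.

```python
import socket


def blast_udp(count=5000, payload=b"x" * 1024):
    """Send `count` datagrams to a bound-but-never-read UDP socket and
    return how many sendto() calls completed."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # OS picks a free port; nobody reads it
    target = receiver.getsockname()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.settimeout(5)  # fail loudly instead of hanging if a send stalls
    sent = 0
    try:
        for _ in range(count):
            sender.sendto(payload, target)
            sent += 1
    finally:
        sender.close()
        receiver.close()
    return sent
```

If every send returns, the sender was never held back by the idle receiver, which is consistent with the GELF/UDP output not being the thing that blocks the pipeline.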

(Andrew Cholakian) #6

You're right that blasting UDP shouldn't block. Have you tried checking the threads in VisualVM to see which are live and which are idle? A screenshot of that would be very useful.

