Logstash Out of memory

Has anyone experienced this when trying to run Logstash with the JDBC input plugin?

../bin/logstash -f logstash.conf
Logstash startup completed
Error: Your application used more memory than the safety cap of 500M.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace

What is going on?

I am getting this with the XML plugin, and my XML documents are sometimes very large, on the order of a few MB each (10-20 MB or more).

I was not able to reproduce this using the sqlite3 JDBC input with a file output. In my test the database was over 1 GB in size, and I did not observe any memory leak. Perhaps something else is going on with the JDBC driver, or with the output.

@michaellizhou Would you provide some more details about your configuration?

Thanks,

Jay

@asatsi I would like to reproduce this. Are you able to provide a sample of your XMLs and your Logstash configuration? If so, please provide them via gist.

Thanks,

Jay

@asatsi Odd, I do not think that 10-20 MB should do this; I am running thousands of files an hour, each around 10 MB. Please share your use case.

@PhaedrusTheGreek The data set is not that large, anywhere between 50-100 MB. I think it is more of a Logstash issue than a JDBC issue, because it is just a stream; otherwise it is an ordinary input config:

jdbc {
    jdbc_driver_library => "ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "jdbc:oracle:thin:@//"
    jdbc_user => ""
    jdbc_password => ""
    statement => "SELECT to_char(creation_time,'YYYY-MM-DD HH24:MI:SSxFF3') as creation_time from log where to_char(creation_time,'YYYY-MM-DD HH24:MI:SSxFF3') > to_char(:sql_last_start,'YYYY-MM-DD HH24:MI:SSxFF3')"
    record_last_run => true
    schedule => "* * * * *"
    last_run_metadata_path => "log_sincedb"
}
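If the JDBC result set itself is what fills the heap, it may help to let the plugin page through the results instead of fetching everything at once. A hedged sketch: `jdbc_paging_enabled` and `jdbc_page_size` are logstash-input-jdbc options, but check that the plugin version you have installed supports them before relying on this.

```
jdbc {
    # ... same connection settings and statement as above ...
    jdbc_paging_enabled => true   # fetch the result set in chunks instead of all at once
    jdbc_page_size => 10000       # rows per page; tune to your average row size
}
```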

Hi,

Thanks for the reply @PhaedrusTheGreek

The gist is available here: https://gist.github.com/asatsi/330e5c23830752d53bee

This is the stacktrace I am getting with LS 1.5.3:
.........\n Sony\n</ns0:LogRequest>", :exception=>#<REXML::ParseException: #<REXML::ParseException: java.lang.OutOfMemoryError: Java heap space
org.joni.StackMachine.ensure1(StackMachine.java:98)
org.joni.StackMachine.push(StackMachine.java:162)
org.joni.StackMachine.pushAlt(StackMachine.java:200)
org.joni.ByteCodeMachine.opPush(ByteCodeMachine.java:1517)
org.joni.ByteCodeMachine.matchAt(ByteCodeMachine.java:272)
org.joni.Matcher.matchCheck(Matcher.java:304)
org.joni.Matcher.searchInterruptible(Matcher.java:480)
org.jruby.RubyRegexp$SearchMatchTask.run(RubyRegexp.java:273)
org.jruby.RubyThread.executeBlockingTask(RubyThread.java:1065)
org.jruby.RubyRegexp.matcherSearch(RubyRegexp.java:235)
org.jruby.RubyRegexp.search19(RubyRegexp.java:1780)
org.jruby.RubyRegexp.matchPos(RubyRegexp.java:1720)
org.jruby.RubyRegexp.match19Common(RubyRegexp.java:1701)
org.jruby.RubyRegexp.match_m19(RubyRegexp.java:1680)
org.jruby.RubyRegexp$INVOKER$i$match_m19.call(RubyRegexp$INVOKER$i$match_m19.gen)
org.jruby.internal.runtime.methods.JavaMethod$JavaMethodOneOrNBlock.call(JavaMethod.java:354)
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)
org.jruby.ast.LocalAsgnNode.interpret(LocalAsgnNode.java:123)
org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)
org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:225)
org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:219)
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:202)
org.jruby.ast.CallTwoArgNode.interpret(CallTwoArgNode.java:59)
org.jruby.ast.LocalAsgnNode.interpret(LocalAsgnNode.java:123)
org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
org.jruby.ast.IfNode.interpret(IfNode.java:118)
org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
org.jruby.ast.RescueNode.executeBody(RescueNode.java:221)
...
Exception parsing
Line: 425
Position: 21817613
Last 80 unconsumed characters:
<Enter the Dragon:EnterpriseProductCatalog xmlns:xsi="http://www.w3.org/2001/X>
/opt/elk-demo/logstash-1.5.3/vendor/jruby/lib/ruby/1.9/rexml/parsers/baseparser.rb:435:in `pull_event'
/opt/elk-demo/logstash-1.5.3/vendor/jruby/lib/ruby/1.9/rexml/parsers/baseparser.rb:183:in `pull'
/opt/elk-demo/logstash-1.5.3/vendor/jruby/lib/ruby/1.9/rexml/parsers/treeparser.rb:22:in `parse'
/opt/elk-demo/logstash-1.5.3/vendor/jruby/lib/ruby/1.9/rexml/document.rb:249:in `build'
/opt/elk-demo/logstash-1.5.3/vendor/jruby/lib/ruby/1.9/rexml/document.rb:43:in `initialize'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:971:in `parse'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:164:in `xml_in'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:203:in `xml_in'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/logstash-filter-xml-1.0.0/lib/logstash/filters/xml.rb:129:in `filter'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/filters/base.rb:163:in `multi_filter'
org/jruby/RubyArray.java:1613:in `each'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/filters/base.rb:160:in `multi_filter'
(eval):24:in `initialize'
org/jruby/RubyProc.java:271:in `call'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:311:in `flush_filters'
org/jruby/RubyArray.java:1613:in `each'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:310:in `flush_filters'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:319:in `flush_filters_to_output!'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:223:in `filterworker'
/opt/elk-demo/logstash-1.5.3/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:157:in `start_filters'
...
java.lang.OutOfMemoryError: Java heap space

Here's the logstash configuration:
input {
  file {
    path => "/tmp/xml.log"
  }
}

filter {
  multiline {
    pattern => "^<ns0:"
    negate => true
    what => "previous"
  }
  xml {
    source => "message"
    target => "x"
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
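Worth noting for the config above: if no incoming line ever matches `^<ns0:`, the multiline filter keeps buffering events and its buffer grows without bound. A hedged sketch, assuming your logstash-filter-multiline version supports `max_age` (which flushes a stalled stream's buffer after a period of inactivity), so check your installed version's documentation:

```
multiline {
  pattern => "^<ns0:"
  negate => true
  what => "previous"
  max_age => 10   # assumption: flush a stream's buffer after 10 s without new lines
}
```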

FYI, the XML OOM issue is also reported in Large XML crashes logstash with OOM.

Should I raise a bug for this?

No, just continue things in your thread.

Update:

Now I find the issue occurs when data starts being shipped over to Logstash. In my configuration I have 2 JDBC inputs and the rest are TCP inputs. Logstash pulls everything from the database without a problem, but when I turn on a shipper this message shows up:

Logstash startup completed
Error: Your application used more memory than the safety cap of 500M.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace

Edit: sometimes I start up Logstash and this occurs immediately...

@michaellizhou Can I see your filter configuration? Some filters, like multiline, have memory-intensive buffers.

This error is just all over the place. I am currently running fine, but I am sure it may occur again. Would it be safe to just increase the memory cap? Multiline is done by the shipper, so there is none in Logstash. I have the general grok and timestamp match; basically 7 of these conditional statements:

grok {
      match => [
        "message" , "(?<logStoreTime>\d{2}:\d{2}:\d{2},\d{3})\s+(?<messageLevel>\w+)\s+\[(?<app>[\[\]\/_()A-Za-z0-9.$\-:]+)\]\s+%{GREEDYDATA:message}"
      ]
      overwrite => ["message"]
      # temporary fix for logs that do not have year month day
      add_field => {"logdate" => "%{+YYYY-MM-dd} %{logStoreTime}"}
    }

    date {
      locale => "en"
      timezone => "America/New_York"
      match => [
        "logdate" , "YYYY-MM-dd HH:mm:ss,SSS"
      ]
    }
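One general way to reduce the regex engine's work (the org.joni frames in the stack trace above are regex matching/backtracking) is to anchor the grok pattern at the start of the line, so non-matching lines fail fast instead of being retried at every offset. A sketch of the same grok with only an anchor added; whether this helps in practice depends on how many of your lines fail the match:

```
grok {
  match => [
    "message" , "^(?<logStoreTime>\d{2}:\d{2}:\d{2},\d{3})\s+(?<messageLevel>\w+)\s+\[(?<app>[\[\]\/_()A-Za-z0-9.$\-:]+)\]\s+%{GREEDYDATA:message}"
  ]
  overwrite => ["message"]
}
```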

@michaellizhou Can I see:
bin/logstash --version
bin/plugin list --verbose

It should be safe to increase LS_HEAP_SIZE to "1000m" to start, but I recommend you watch for any differences in performance.
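For reference, the 500M cap in the error message is the default heap. In Logstash 1.5.x the startup script reads the LS_HEAP_SIZE environment variable, so raising it looks like this (1000m is a starting point to experiment with, not a tuned value):

```
LS_HEAP_SIZE=1000m bin/logstash -f logstash.conf
```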

./bin/logstash --version
logstash 1.5.0

and

./bin/plugin list --verbose
logstash-codec-collectd (1.0.1)
logstash-codec-dots (1.0.0)
logstash-codec-edn (1.0.0)
logstash-codec-edn_lines (1.0.0)
logstash-codec-es_bulk (1.0.0)
logstash-codec-fluent (1.0.0)
logstash-codec-graphite (1.0.0)
logstash-codec-json (1.0.1)
logstash-codec-json_lines (1.0.1)
logstash-codec-line (1.0.0)
logstash-codec-msgpack (1.0.0)
logstash-codec-multiline (1.0.0)
logstash-codec-netflow (1.0.0)
logstash-codec-oldlogstashjson (1.0.0)
logstash-codec-plain (1.0.0)
logstash-codec-rubydebug (1.0.0)
logstash-filter-anonymize (1.0.0)
logstash-filter-checksum (1.0.1)
logstash-filter-clone (1.0.0)
logstash-filter-csv (1.0.0)
logstash-filter-date (1.0.0)
logstash-filter-dns (1.0.0)
logstash-filter-drop (1.0.0)
logstash-filter-fingerprint (1.0.0)
logstash-filter-geoip (1.0.2)
logstash-filter-grok (1.0.0)
logstash-filter-json (1.0.1)
logstash-filter-kv (1.0.0)
logstash-filter-metrics (1.0.0)
logstash-filter-multiline (1.0.0)
logstash-filter-mutate (1.0.2)
logstash-filter-ruby (1.0.0)
logstash-filter-sleep (1.0.0)
logstash-filter-split (1.0.0)
logstash-filter-syslog_pri (1.0.0)
logstash-filter-throttle (1.0.0)
logstash-filter-urldecode (1.0.0)
logstash-filter-useragent (1.0.1)
logstash-filter-uuid (1.0.0)
logstash-filter-xml (1.0.0)
logstash-input-couchdb_changes (1.0.0)
logstash-input-courier (1.8.1)
logstash-input-elasticsearch (1.0.2)
logstash-input-eventlog (1.0.0)
logstash-input-exec (1.0.0)
logstash-input-file (1.0.1)
logstash-input-ganglia (1.0.0)
logstash-input-gelf (1.0.0)
logstash-input-generator (1.0.0)
logstash-input-graphite (1.0.0)
logstash-input-heartbeat (1.0.0)
logstash-input-imap (1.0.0)
logstash-input-irc (1.0.0)
logstash-input-jdbc (1.0.0)
logstash-input-kafka (1.0.0)
logstash-input-log4j (1.0.0)
logstash-input-lumberjack (1.0.5)
logstash-input-pipe (1.0.0)
logstash-input-rabbitmq (1.1.1)
logstash-input-redis (1.0.3)
logstash-input-s3 (1.0.0)
logstash-input-snmptrap (1.0.0)
logstash-input-sqs (1.1.0)
logstash-input-stdin (1.0.0)
logstash-input-syslog (1.0.1)
logstash-input-tcp (1.0.0)
logstash-input-twitter (1.0.1)
logstash-input-udp (1.0.0)
logstash-input-unix (1.0.0)
logstash-input-xmpp (1.0.0)
logstash-input-zeromq (1.0.0)
logstash-output-cloudwatch (1.0.0)
logstash-output-csv (1.0.0)
logstash-output-elasticsearch (1.0.7)
logstash-output-elasticsearch_http (1.0.0)
logstash-output-email (1.0.0)
logstash-output-exec (1.0.0)
logstash-output-file (1.0.0)
logstash-output-ganglia (1.0.0)
logstash-output-gelf (1.0.0)
logstash-output-graphite (1.0.2)
logstash-output-hipchat (1.0.0)
logstash-output-http (1.1.0)
logstash-output-irc (1.0.0)
logstash-output-juggernaut (1.0.0)
logstash-output-kafka (1.0.0)
logstash-output-lumberjack (1.0.2)
logstash-output-nagios (1.0.0)
logstash-output-nagios_nsca (1.0.0)
logstash-output-null (1.0.1)
logstash-output-opentsdb (1.0.0)
logstash-output-pagerduty (1.0.0)
logstash-output-pipe (1.0.0)
logstash-output-rabbitmq (1.1.2)
logstash-output-redis (1.0.0)
logstash-output-s3 (1.0.0)
logstash-output-sns (2.0.1)
logstash-output-sqs (1.0.0)
logstash-output-statsd (1.1.0)
logstash-output-stdout (1.0.0)
logstash-output-tcp (1.0.0)
logstash-output-udp (1.0.0)
logstash-output-xmpp (1.0.0)
logstash-output-zeromq (1.0.0)
logstash-patterns-core (0.3.0)

There was a TCP Input memory leak before 1.5.0, but it looks like that is not an issue here. It's possible that increased heap is necessary for your normal operations.

Could it be an actual plugin issue? I updated my plugins recently and I do not see the issue any more. I will keep testing to see whether any memory issues come back. Thanks for the detail on the TCP leak.

After you did your update, there would be some output like this:

Updated logstash-filter-mutate 1.0.1 to 1.0.2
Updated logstash-input-elasticsearch 1.0.0 to 1.0.2
Updated logstash-input-http 1.0.2 to 1.0.3
Updated logstash-input-lumberjack 1.0.4 to 1.0.5
Updated logstash-input-rabbitmq 1.1.0 to 1.1.1
Updated logstash-input-sqs 1.0.0 to 1.1.0
Updated logstash-output-http 1.0.0 to 1.1.0
Updated logstash-output-null 1.0.0 to 1.0.1
Updated logstash-output-rabbitmq 1.1.1 to 1.1.2

Do you have a copy of it?

I looked at what was updated: no input or output plugins were updated, only codecs and filters. Strange, so it can't be the TCP thing, I guess...

Please let me know which plugins were updated. This will be useful to know, especially if the issue is solved.

Here is the list, but I am still not sure whether the issue is solved. What I know for now is that I don't see the error; let's see if it occurs again, Logstash has only been up for 1-2 hours.

Updated logstash-codec-collectd 0.1.9 to 1.0.1
Updated logstash-codec-dots 0.1.6 to 1.0.0
Updated logstash-codec-edn 0.1.6 to 1.0.0
Updated logstash-codec-edn_lines 0.1.7 to 1.0.0
Updated logstash-codec-es_bulk 0.1.6 to 1.0.0
Updated logstash-codec-fluent 0.1.6 to 1.0.0
Updated logstash-codec-graphite 0.1.6 to 1.0.0
Updated logstash-codec-json 0.1.7 to 1.0.1
Updated logstash-codec-json_lines 0.1.8 to 1.0.1
Updated logstash-codec-line 0.1.6 to 1.0.0
Updated logstash-codec-msgpack 0.1.7 to 1.0.0
Updated logstash-codec-multiline 0.1.9 to 1.0.0
Updated logstash-codec-netflow 0.1.6 to 1.0.0
Updated logstash-codec-oldlogstashjson 0.1.6 to 1.0.0
Updated logstash-codec-plain 0.1.6 to 1.0.0
Updated logstash-codec-rubydebug 0.1.7 to 1.0.0
Updated logstash-filter-anonymize 0.1.5 to 1.0.0
Updated logstash-filter-checksum 0.1.6 to 1.0.1
Updated logstash-filter-clone 0.1.5 to 1.0.0
Updated logstash-filter-csv 0.1.5 to 1.0.0
Updated logstash-filter-date 0.1.6 to 1.0.0
Updated logstash-filter-dns 0.1.5 to 1.0.0
Updated logstash-filter-drop 0.1.5 to 1.0.0
Updated logstash-filter-fingerprint 0.1.5 to 1.0.0
Updated logstash-filter-geoip 0.1.9 to 1.0.2
Updated logstash-filter-grok 0.1.10 to 1.0.0
Updated logstash-filter-json 0.1.6 to 1.0.1
Updated logstash-filter-kv 0.1.6 to 1.0.0
Updated logstash-filter-metrics 0.1.8 to 1.0.0
Updated logstash-filter-multiline 0.1.6 to 1.0.0
Updated logstash-filter-mutate 0.1.8 to 1.0.2
Updated logstash-filter-ruby 0.1.5 to 1.0.0
Updated logstash-filter-sleep 0.1.5 to 1.0.0
Updated logstash-filter-split 0.1.6 to 1.0.0
Updated logstash-filter-syslog_pri 0.1.5 to 1.0.0
Updated logstash-filter-throttle 0.1.5 to 1.0.0