Logstash pipeline worker stops processing events because of exception

Hello,

Currently our beats-dead-letter-queue-processing-pipeline is hitting an exception that stops events from being output to Elasticsearch:

beats-dead-letter-queue-processing-pipeline] Pipeline worker error, the pipeline will be stopped {:pipeline_id=>"beats-dead-letter-queue-processing-pipeline", :error=>"(NoMethodError) undefined method `pop' for nil:NilClass", :exception=>Java::OrgJrubyExceptions::NoMethodError, :backtrace=>["usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.amazing_print_minus_1_dot_4_dot_0.lib.amazing_print.inspector.awesome(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/amazing_print-1.4.0/lib/amazing_print/inspector.rb:93)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.amazing_print_minus_1_dot_4_dot_0.lib.amazing_print.core_ext.kernel.ai(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/amazing_print-1.4.0/lib/amazing_print/core_ext/kernel.rb:11)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_codec_minus_rubydebug_minus_3_dot_1_dot_0.lib.logstash.codecs.rubydebug.encode_with_metadata(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-codec-rubydebug-3.1.0/lib/logstash/codecs/rubydebug.rb:42)", "org.jruby.RubyMethod.call(org/jruby/RubyMethod.java:119)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_cod

I'm not sure exactly what's going on here. Is this something in our beats_deadletterqueue_pipeline.conf that needs to be edited, perhaps in the filtering?

beats_deadletterqueue_pipeline.conf:

input {
  dead_letter_queue {
    path => "/usr/share/logstash/data/dead_letter_queue" 
    commit_offsets => true 
    pipeline_id => "beats" 
  }
}
filter {
    # First, we must capture the entire event, and write it to a new
    # field; we'll call that field `failed_message`
    ruby {
        code => "event.set('failed_message', event.to_json())"
    }

    # Next, we prune every field off the event except for the one we've
    # just created. Note that this does not prune event metadata.
    prune {
        whitelist_names => [ "^failed_message$" ]
    }

    # Next, convert the metadata timestamp to one we can parse with a
    # date filter. Before conversion, this field is a Logstash::Timestamp.
    # http://www.rubydoc.info/gems/logstash-core/LogStash/Timestamp
    ruby {
        code => "event.set('timestamp', event.get('[@metadata][dead_letter_queue][entry_time]').to_s())"
    }

    # Apply the date filter.
    date {
        match => [ "timestamp", "ISO8601" ]
    }

    # Pull useful information out of the event metadata provided by the dead
    # letter queue, and add it to the new event.
    mutate {
        add_field => {
            "message" => "%{[@metadata][dead_letter_queue][reason]}"
            "plugin_id" => "%{[@metadata][dead_letter_queue][plugin_id]}"
            "plugin_type" => "%{[@metadata][dead_letter_queue][plugin_type]}"
        }
    }
}
output {
    stdout {
        codec => rubydebug { metadata => true }
    }
    if [@metadata][beat] {
        
        # Ship to current logging system (logs.dev.test.corp) - logstash index
        elasticsearch { 
            id => "dead-letter-beats-es-6-dynamic"
            hosts => "${elasticsearch_hosts}"
            index => "logstash-deadletter-%{+yyyy.MM.dd}"
        }
    }  else {
        # Ship to current logging system (logs.dev.test.corp) - logstash index
        elasticsearch { 
            id => "dead-letter-beats-es-6-static"
            hosts => "${elasticsearch_hosts}"
            index => "logstash-deadletter-%{+yyyy.MM.dd}"
        }
    }
}

Is it possible that your JVM is getting a java.lang.OutOfMemoryError immediately before this?

Yes, I was seeing this as well:

java.lang.OutOfMemoryError: Java heap space

OK, so that's the issue you need to focus on. It's arguably a bug in amazing_print that it tries to pop an item off that array when the array is empty, but it's not worth worrying about here.

Make sure you have -XX:+HeapDumpOnOutOfMemoryError set in your jvm.options. Reproduce the error, then analyze the resulting heap dump with a dump analyzer (I really like Eclipse MAT) to get a leak-suspects report.
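For example, something like this in jvm.options (a minimal sketch; the dump path is just an illustration, so point it at a filesystem with enough free space to hold a dump as large as your heap):

# write a heap dump to disk when an OutOfMemoryError is thrown
-XX:+HeapDumpOnOutOfMemoryError
# if this points at a directory, the JVM writes java_pid<pid>.hprof into it
-XX:HeapDumpPath=/var/lib/logstash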

Most likely almost all of the memory will be anchored by Leak Suspect #1. It may or may not then be obvious what the problem is. Unless you are experienced with heap analysis, you are unlikely to find anything in the dump more useful than the leak-suspects report.
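If the dump is too large to open comfortably in the MAT GUI, MAT also ships a headless parser that can generate the leak-suspects report on its own (the script name and report id below are from the Linux distribution of MAT; the dump path is a placeholder):

./ParseHeapDump.sh /path/to/java_pid12345.hprof org.eclipse.mat.api:suspects

That writes a Leak Suspects report archive next to the dump containing the same report you would see in the GUI.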

