Logstash throws an error when parsing a huge log file

I have a Logstash configuration that reads files from a directory. Each input file consists of a number of XML blocks. The filter looks for XML blocks and then tries to extract the value of a particular element within each block. The configuration works for small files but fails with an error on large files (in excess of 50 MB).
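
Each XML block in the input looks roughly like this (a simplified sketch; the real blocks contain many more elements, and the namespace declaration shown here is just a placeholder):

    <SomeEvent xmlns:ns9="http://example.com/schema">
        <ns9:ATag>ASSIGN</ns9:ATag>
        <!-- many more elements -->
    </SomeEvent>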

Here is my logstash.conf

input {
    file {
        path => "/usr/share/logstash/logs/*"
        codec => multiline {
            pattern => "<SomeEvent.*>"
            negate => "true"
            what => "previous"
            max_lines => 1000 # xml block has lots and lots of elements
        }
    }
}

filter {
    xml {
        source => "message"
        store_xml => "false"
        xpath => ["//ns9:ATag/text()", "tagType"]
    }
    mutate {
        remove_field => [ "message" ]
    }
}

output {
	elasticsearch {
		hosts => "elasticsearch:9200"
	}
	stdout {}
}

Here is a snippet of the stdout. Plenty of events are pushed to Elasticsearch successfully before it fails with the error below. When I run the same file again, it fails at a different line (in other words, it doesn't fail at the same place every time).

logstash_1 | {
logstash_1 |       "@version" => "1",
logstash_1 |     "@timestamp" => 2019-06-07T02:50:19.582Z,
logstash_1 |           "tags" => [
logstash_1 |         [0] "multiline"
logstash_1 |     ],
logstash_1 |           "host" => "c262525ef385",
logstash_1 |        "tagType" => [
logstash_1 |         [0] "ASSIGN"
logstash_1 |     ],
logstash_1 |           "path" => "/usr/share/logstash/logs/failing.xml"
logstash_1 | }
[2019-06-07T02:50:19,980][ERROR][logstash.pipeline ] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash. {:pipeline_id=>"main", "exception"=>"//ns9:ATag/text()", "backtrace"=>[
    "nokogiri/XmlXpathContext.java:128:in `evaluate'",
    "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.10.0-java/lib/nokogiri/xml/searchable.rb:198:in `xpath_impl'",
    "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.10.0-java/lib/nokogiri/xml/searchable.rb:179:in `xpath_internal'",
    "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.10.0-java/lib/nokogiri/xml/searchable.rb:154:in `xpath'",
    "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-xml-4.0.6/lib/logstash/filters/xml.rb:153:in `block in filter'",
    "org/jruby/RubyHash.java:1343:in `each'",
    "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-xml-4.0.6/lib/logstash/filters/xml.rb:152:in `filter'",
    "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:143:in `do_filter'",
    "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:162:in `block in multi_filter'",
    "org/jruby/RubyArray.java:1734:in `each'",
    "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:159:in `multi_filter'",
    "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:115:in `multi_filter'",
    "(eval):71:in `block in filter_func'",
    "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:341:in `filter_batch'",
    "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:320:in `worker_loop'",
    "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:287:in `block in start_workers'"
], :thread=>"#<Thread:0x2bd6e652 sleep>"}

I would appreciate any insight into this. Thanks,
K
