I have a Logstash configuration that reads files from a directory. Each input file consists of a number of XML blocks. The filter detects the XML blocks and then tries to extract the value of a particular element within each block. The configuration works for small files but fails with an error on large files (in excess of 50 MB).
Here is my logstash.conf, followed by a simplified sample of the input:
input {
  file {
    path => "/usr/share/logstash/logs/*"
    codec => multiline {
      pattern => "<SomeEvent.*>"
      negate => "true"
      what => "previous"
      max_lines => 1000 # each XML block has lots and lots of elements
    }
  }
}

filter {
  xml {
    source => "message"
    store_xml => "false"
    xpath => ["//ns9:ATag/text()", "tagType"]
  }
  mutate {
    remove_field => [ "message" ]
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
  stdout {}
}
Here is a snippet of the stdout. Lots of events are pushed to ES successfully before it fails with the error below. When I run the same file again, it fails at a different line (in other words, it doesn't fail at the same place every time).
logstash_1 | {
logstash_1 |     "@version" => "1",
logstash_1 |  "@timestamp" => 2019-06-07T02:50:19.582Z,
logstash_1 |        "tags" => [
logstash_1 |         [0] "multiline"
logstash_1 |     ],
logstash_1 |        "host" => "c262525ef385",
logstash_1 |     "tagType" => [
logstash_1 |         [0] "ASSIGN"
logstash_1 |     ],
logstash_1 |        "path" => "/usr/share/logstash/logs/failing.xml"
logstash_1 | }
[2019-06-07T02:50:19,980][ERROR][logstash.pipeline ] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash. {:pipeline_id=>"main", "exception"=>"//ns9:ATag/text()", "backtrace"=>[
  "nokogiri/XmlXpathContext.java:128:in `evaluate'",
  "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.10.0-java/lib/nokogiri/xml/searchable.rb:198:in `xpath_impl'",
  "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.10.0-java/lib/nokogiri/xml/searchable.rb:179:in `xpath_internal'",
  "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.10.0-java/lib/nokogiri/xml/searchable.rb:154:in `xpath'",
  "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-xml-4.0.6/lib/logstash/filters/xml.rb:153:in `block in filter'",
  "org/jruby/RubyHash.java:1343:in `each'",
  "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-xml-4.0.6/lib/logstash/filters/xml.rb:152:in `filter'",
  "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:143:in `do_filter'",
  "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:162:in `block in multi_filter'",
  "org/jruby/RubyArray.java:1734:in `each'",
  "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:159:in `multi_filter'",
  "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:115:in `multi_filter'",
  "(eval):71:in `block in filter_func'",
  "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:341:in `filter_batch'",
  "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:320:in `worker_loop'",
  "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:287:in `block in start_workers'"
], :thread=>"#<Thread:0x2bd6e652 sleep>"}
I would appreciate any insight into this. Thanks,
K