Intermittent ArrayIndexOutOfBoundsException in dissect


#1

I am trying to load UK house price data from an HM Land Registry CSV. Column 4 in that CSV should be the postcode, but it is sometimes empty, and sometimes that causes dissect to throw an ArrayIndexOutOfBoundsException. It typically gets through 10,000 to 20,000 rows before failing. In all the simpler test cases that I tried, I get a _dissectfailure tag when dissect works on a nil field. Wrapping the dissect in the obvious 'if [postcode]' conditional does prevent the exception. This is in 6.0.0-rc1.

input { stdin {} }
output { stdout { codec => dots } }

filter {
  csv {
    columns => [ "id", "price", "date", "postcode", "propertytype",
      "new", "duration", "paon", "saon", "street", "locality",
      "town", "district", "county", "category", "status" ]
    remove_field => [ "message", "host" ]
  }

  dissect {
    mapping => { "postcode" => "%{postcode1} %{postcode2}" }
  }
}
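
For reference, the 'if [postcode]' guard mentioned above looks roughly like this. It is a minimal sketch that assumes the csv filter and the rest of the pipeline stay exactly as shown; it only skips dissect for events where the postcode field was not set, and does not address why the exception is thrown in the first place.

```
filter {
  # csv filter unchanged from the config above

  # Only run dissect when the csv filter actually produced a postcode
  if [postcode] {
    dissect {
      mapping => { "postcode" => "%{postcode1} %{postcode2}" }
    }
  }
}
```

With this in place, events without a postcode pass through without the postcode1 and postcode2 fields.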

#2

I forgot the error messages...

There are several of these, which appear to be dissect handling things well:

[ERROR] 2017-10-17 12:55:27.729 [[main]>worker0] dissect - Dissect threw an exception {"backtrace"=>["org.logstash.dissect.ValueResolver.get(ValueResolver.java:18)", "org.logstash.dissect.fields.NormalField.append(NormalField.java:41)", "org.logstash.dissect.Dissector.dissect(Dissector.java:134)", "org.logstash.dissect.JavaDissectorLibrary$RubyDissect.dissect(JavaDissectorLibrary.java:113)", "org.logstash.dissect.JavaDissectorLibrary$RubyDissect.dissect_multi(JavaDissectorLibrary.java:140)", "org.logstash.dissect.JavaDissectorLibrary$RubyDissect$INVOKER$i$2$0$dissect_multi.call(JavaDissectorLibrary$RubyDissect$INVOKER$i$2$0$dissect_multi.gen)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logstash_minus_filter_minus_dissect_minus_1_dot_0_dot_12.lib.logstash.filters.dissect.invokeOther2:dissect_multi(/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-dissect-1.0.12/lib/logstash/filters/dissect.rb:182)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logstash_minus_filter_minus_dissect_minus_1_dot_0_dot_12.lib.logstash.filters.dissect.RUBY$method$multi_filter$0(/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-dissect-1.0.12/lib/logstash/filters/dissect.rb:182)", "usr.share.logstash.logstash_minus_core.lib.logstash.filter_delegator.invokeOther10:multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:48)", "usr.share.logstash.logstash_minus_core.lib.logstash.filter_delegator.RUBY$method$multi_filter$0(/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:48)", "usr.share.logstash.logstash_minus_core.lib.logstash.pipeline.invokeOther4:filter_func(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:504)", "usr.share.logstash.logstash_minus_core.lib.logstash.pipeline.RUBY$block$filter_batch$0(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:504)", 
"usr.share.logstash.logstash_minus_core.lib.logstash.util.wrapped_synchronous_queue.invokeOther1:call(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:228)", "usr.share.logstash.logstash_minus_core.lib.logstash.util.wrapped_synchronous_queue.RUBY$block$each$0(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:228)"], "exception"=>"java.lang.NullPointerException"}

and then this, where the exception occurs at JavaDissectorLibrary.java:130 (cf. JavaDissectorLibrary.java:113 above)

[ERROR] 2017-10-17 12:55:36.470 [[main]>worker1] pipeline - Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash.[...]
Exception in thread "[main]>worker1" java.lang.ArrayIndexOutOfBoundsException: 0
        at org.logstash.dissect.DissectorErrorUtils.backtrace(DissectorErrorUtils.java:16)
        at org.logstash.dissect.JavaDissectorLibrary$RubyDissect.logException(JavaDissectorLibrary.java:224)
        at org.logstash.dissect.JavaDissectorLibrary$RubyDissect.dissect(JavaDissectorLibrary.java:130)
        at org.logstash.dissect.JavaDissectorLibrary$RubyDissect.dissect_multi(JavaDissectorLibrary.java:140)
        at org.logstash.dissect.JavaDissectorLibrary$RubyDissect$INVOKER$i$2$0$dissect_multi.call(JavaDissectorLibrary$RubyDissect$INVOKER$i$2$0$dissect_multi.gen)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:193)
        at usr.share.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logstash_minus_filter_minus_dissect_minus_1_dot_0_dot_12.lib.logstash.filters.dissect.invokeOther2:dissect_multi(/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-dissect-1.0.12/lib/logstash/filters/dissect.rb:182)
        at usr.share.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logstash_minus_filter_minus_dissect_minus_1_dot_0_dot_12.lib.logstash.filters.dissect.RUBY$method$multi_filter$0(/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-dissect-1.0.12/lib/logstash/filters/dissect.rb:182)
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:103)
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:163)
        at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:200)
        at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:338)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:163)
        at usr.share.logstash.logstash_minus_core.lib.logstash.filter_delegator.invokeOther10:multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:48)
        at usr.share.logstash.logstash_minus_core.lib.logstash.filter_delegator.RUBY$method$multi_filter$0(/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:48)
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:103)
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:163)
        at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:200)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:161)
        at org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:314)
        at org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:73)
        at org.jruby.ir.interpreter.Interpreter.INTERPRET_BLOCK(Interpreter.java:132)
        at org.jruby.runtime.MixedModeIRBlockBody.commonYieldPath(MixedModeIRBlockBody.java:148)
        at org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:73)
        at org.jruby.runtime.Block.call(Block.java:124)
        at org.jruby.RubyProc.call(RubyProc.java:289)
[...]

Then it carries on processing for a while until we get the same ArrayIndexOutOfBoundsException in worker0, at which point Logstash stops processing, since it has no workers left running.

