Errors when parsing JSON

Hi,
My pipeline works for shorter JSON files, but when I ran it against the production files it crashed. I've enclosed a log excerpt from Logstash to show the output.


[WARN ] 2022-11-17 18:08:03.585 [[ip40]>worker0] split - Only String and Array types are splittable. field:measData is of type = NilClass
[ERROR] 2022-11-17 18:08:03.586 [[ip40]>worker0] ruby - Ruby exception occurred: undefined method `each' for nil:NilClass {:class=>"NoMethodError", :backtrace=>["(ruby filter code):3:in `block in filter_method'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-ruby-3.1.8/lib/logstash/filters/ruby.rb:96:in `inline_script'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-ruby-3.1.8/lib/logstash/filters/ruby.rb:89:in `filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:159:in `do_filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:178:in `block in multi_filter'", "org/jruby/RubyArray.java:1821:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:175:in `multi_filter'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:134:in `multi_filter'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:300:in `block in start_workers'"]}
[WARN ] 2022-11-17 18:08:03.586 [[ip40]>worker0] split - Only String and Array types are splittable. field:[measData][measInfo] is of type = NilClass
[ERROR] 2022-11-17 18:08:03.587 [[ip40]>worker0] ruby - Ruby exception occurred: undefined method `each' for nil:NilClass {:class=>"NoMethodError", :backtrace=>["(ruby filter code):3:in `block in filter_method'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-ruby-3.1.8/lib/logstash/filters/ruby.rb:96:in `inline_script'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-ruby-3.1.8/lib/logstash/filters/ruby.rb:89:in `filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:159:in `do_filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:178:in `block in multi_filter'", "org/jruby/RubyArray.java:1821:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:175:in `multi_filter'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:134:in `multi_filter'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:300:in `block in start_workers'"]}
{
    "@timestamp" => 2022-11-17T18:08:03.565179Z,
          "path" => "/opt/data/input/ip40_for_logstash/test/input1.json",
          "host" => "0.0.0.0",
          "tags" => [
        [0] "_jsonparsefailure",
        [1] "_split_type_failure",
        [2] "_rubyexception"

I can also share some input files to validate this pipeline.
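For reference, this is roughly the structure I assume the input files have, inferred from the field paths used in the filter below (a simplified sketch with illustrative values, not an actual production file):

{
        "measFileHeader": { "collectionBeginTime": "202211171800+0100" },
        "measData": [
                {
                        "nEId": { "nEDistinguishedNome": "..." },
                        "measInfo": [
                                { "measStortTime": "...", "gronularityPeriod": 900 }
                        ]
                }
        ],
        "measFileFooter": { }
}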

input {
        file {
                codec => multiline {
                        pattern => "^}"
                        negate => true
                        what => next
                        max_lines => 20000
                        auto_flush_interval => 4
                        multiline_tag => ""
                }
                path => "/opt/data/input/IP40_for_logstash/*.json"
                sincedb_path => "/dev/null"
                start_position => beginning
                file_completed_action => "log"
                file_completed_log_path => "/opt/data/logstash_files/fin_eir.log"
                mode => read
        }
}


filter {
        mutate { gsub => ["message", "\n", ""] }
        json { source => "message" }
        split { field => "measData" }
        date {
                match => ["[measFileHeader][collectionBeginTime]", "yyyyMMddHHmmZ"]
                timezone => "Europe/Paris"
                target => "@timestamp"
        }
        mutate { remove_field => [ "[measFileHeader][collectionBeginTime]", "[measData][measInfo][gronularityPeriod]", "[measData][measInfo][0][measStortTime]" ] }

        ruby {
                code => '
                        event.get("[measData][nEId]").each { |k, v|
                                event.set(k, v)
                        }
                        event.remove("[measData][nEId]")
                '
        }
        split { field => "[measData][measInfo]" }
        ruby {
                code => '
                        event.get("[measData][measInfo]").each { |k, v|
                                event.set(k, v)
                        }
                        event.remove("[measData][measInfo]")
                '
        }

        if "_jsonparsefailure" not in [tags] {
                mutate { remove_field => ["message", "host", "path", "measData", "nEDistinguishedNome", "measStortTime", "collectionBeginTime", "measFileFooter", "measFileHeader", "@version"] }
        }
}
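Looking at the log again, I think the split and ruby steps blow up whenever the json filter fails: the event then has no measData at all, so split warns about NilClass and the ruby each raises NoMethodError. A guard along these lines would probably avoid the exceptions for the failing events (just a sketch, with the rest of the filter left as posted):

filter {
        # Sketch: only run the measData steps when the JSON was actually parsed
        if [measData] {
                split { field => "measData" }
                ruby {
                        code => '
                                ne_id = event.get("[measData][nEId]")
                                ne_id.each { |k, v| event.set(k, v) } if ne_id
                                event.remove("[measData][nEId]")
                        '
                }
        }
}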


At the end of the Logstash output I also got a max_bytes_length_exceeded_exception:

"@timestamp"=>2022-11-17T17:42:32.517700Z}], :response=>{"index"=>{"_index"=>"logstash-IP40-2022.11.17", "_id"=>"6D2vhoQBl_Aqej60Z-28", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"Document contains at least one immense term in field=\"message\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[9, 9, 9, 9, 9, 34, 67, 55, 83, 67, 83, 85, 66, 83, 89, 83, 46, 55, 46, 67, 79, 77, 83, 71, 84, 69, 82, 77, 34, 58]...', original message: bytes can be at most 32766 in length; got 175906", "caused_by"=>{"type"=>"max_bytes_length_exceeded_exception", "reason"=>"max_bytes_length_exceeded_exception: bytes can be at most 32766 in length; got 175906"}}}}}

This topic can be closed. I found the reason: the event was being limited by line size, so it was cut off and never matched the final ^} pattern.
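If anyone else hits this: my guess is the multiline codec's limits (max_lines, and max_bytes, which defaults to 10 MiB) are what truncate the event before the closing } line. Raising those limits would be the first thing to try; a sketch with illustrative values, the other file options unchanged:

input {
        file {
                # Sketch: same codec, with larger limits so big documents are not cut off
                codec => multiline {
                        pattern => "^}"
                        negate => true
                        what => next
                        max_lines => 100000
                        max_bytes => "50 MiB"
                        auto_flush_interval => 4
                        multiline_tag => ""
                }
                # (other file options as in the original input block)
        }
}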
