Logstash stops receiving logs

Hi everyone. I am a beginner with Logstash and the ELK stack in general. Lately I have been facing an issue with Logstash.

I am receiving logs on port 3014, and this is the second time this has happened, so I have identified a pattern: every time a log that can't be decoded arrives on port 3014, Logstash stops processing any logs. I will try to explain this as clearly as I can, and I apologise if I miss something.

My logstash configuration looks like this:

input {
  syslog {
    port => 3014
    codec => cef
    syslog_field => "syslog"
    grok_pattern => "<%{POSINT:priority}>%{SYSLOGTIMESTAMP:timestamp}"
  }
}
filter {
  prune {
    whitelist_names => ["@timestamp", "message", "name", "destinationUserName", "sourceUserName", "ad.loginName", "sourceServiceName", "ad.destinationHosts", "userID", "deviceAction", "deviceEventClassId"]
  }
  mutate { gsub => [ "ad.loginName", "USERNM[\\]", "" ] }
  if [destinationUserName] and [sourceUserName] {
    mutate { add_field => { "userID" => "%{ad.loginName}" } }
  } else if [destinationUserName] {
    mutate { add_field => { "userID" => "%{destinationUserName}" } }
  } else if [sourceUserName] {
    mutate { add_field => { "userID" => "%{sourceUserName}" } }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash_index"
  }
  stdout {
    codec => rubydebug
  }
}

This is now the second time that, when a log containing some gibberish text comes in, Logstash stops receiving any sort of data on port 3014.

The only way I can start receiving the logs again is to restart the VM.

I was wondering if this is a case where Logstash doesn't know how to deal with this specific event, throws an error, and stops receiving logs, and whether there is any workaround for this issue.

Thank you very much for your time and help, and I apologise again for the basic question.

What do the Logstash logs show?

Hello. Thank you so much for your reply. This is the log for that error:

[2021-10-17T23:11:34,974][ERROR][logstash.javapipeline    ][main] Pipeline worker error, the pipeline will be stopped {:pipeline_id=>"main", :error=>"(ArgumentError) invalid byte sequence in UTF-8", :exception=>Java::OrgJrubyExceptions::ArgumentError, :backtrace=>["org.jruby.RubyString.count(org/jruby/RubyString.java:5386)", "uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.csv.init_separators(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/csv.rb:2095)", "org.jruby.RubyArray.map(org/jruby/RubyArray.java:2577)", "uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.csv.<<(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/csv.rb:1698)", "uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.csv.generate_line(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/csv.rb:1200)", "uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.csv.to_csv(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/csv.rb:2343)", "C_3a_.logstash_minus_7_dot_12_dot_1.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_output_minus_csv_minus_3_dot_0_dot_8.lib.logstash.outputs.csv.event_to_csv(C:/logstash-7.12.1/vendor/bundle/jruby/2.5.0/gems/logstash-output-csv-3.0.8/lib/logstash/outputs/csv.rb:59)", "C_3a_.logstash_minus_7_dot_12_dot_1.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_output_minus_csv_minus_3_dot_0_dot_8.lib.logstash.outputs.csv.multi_receive_encoded(C:/logstash-7.12.1/vendor/bundle/jruby/2.5.0/gems/logstash-output-csv-3.0.8/lib/logstash/outputs/csv.rb:41)", "org.jruby.RubyArray.each(org/jruby/RubyArray.java:1809)", "C_3a_.logstash_minus_7_dot_12_dot_1.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_output_minus_csv_minus_3_dot_0_dot_8.lib.logstash.outputs.csv.multi_receive_encoded(C:/logstash-7.12.1/vendor/bundle/jruby/2.5.0/gems/logstash-output-csv-3.0.8/lib/logstash/outputs/csv.rb:39)", "C_3a_.logstash_minus_7_dot_12_dot_1.logstash_minus_core.lib.logstash.outputs.base.multi_receive(C:/logstash-7.12.1/logstash-core/lib/logstash/outputs/base.rb:103)", "org.logstash.config.ir.compiler.OutputStrategyExt$AbstractOutputStrategyExt.multi_receive(org/logstash/config/ir/compiler/OutputStrategyExt.java:143)", "org.logstash.config.ir.compiler.AbstractOutputDelegatorExt.multi_receive(org/logstash/config/ir/compiler/AbstractOutputDelegatorExt.java:121)", "C_3a_.logstash_minus_7_dot_12_dot_1.logstash_minus_core.lib.logstash.java_pipeline.start_workers(C:/logstash-7.12.1/logstash-core/lib/logstash/java_pipeline.rb:295)"], :thread=>"#<Thread:0x5ddabec sleep>"}

As I suspected, the error is due to :error=>"(ArgumentError) invalid byte sequence in UTF-8". How can I solve this issue in Logstash?

Thank you very much

That error is in a csv output. The configuration you say you are running does not include a csv output, so obviously you are not running the configuration that you say you are.

The cef codec is careful to ensure that the event is valid UTF-8, so I doubt the event that blew up the csv output came from a cef codec.

Without knowing where the event came from it is hard to say how to fix it. You could probably write a ruby filter that would repair an event that contains non-UTF-8 data.
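Something along these lines might be a starting point (untested, and it only looks at top-level string fields):

ruby {
  code => '
    # replace invalid byte sequences in any top-level string field
    event.to_hash.each do |k, v|
      event.set(k, v.scrub) if v.is_a?(String) && !v.valid_encoding?
    end
  '
}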

Hi Badger. I am so sorry; since I thought (wrongly) that the error was in the first half of the configuration, I didn't include the entire file. This is the full configuration:

input {
  syslog {
    port => 3014
    codec => cef
    syslog_field => "syslog"
    grok_pattern => "<%{POSINT:priority}>%{SYSLOGTIMESTAMP:timestamp}"
  }
}
filter {
  prune {
    whitelist_names => ["@timestamp", "message", "name", "destinationUserName", "sourceUserName", "ad.loginName", "sourceServiceName", "ad.destinationHosts", "userID", "deviceAction", "deviceEventClassId"]
  }
  mutate { gsub => [ "ad.loginName", "USERNM[\\]", "" ] }
  if [destinationUserName] and [sourceUserName] {
    mutate { add_field => { "userID" => "%{ad.loginName}" } }
  } else if [destinationUserName] {
    mutate { add_field => { "userID" => "%{destinationUserName}" } }
  } else if [sourceUserName] {
    mutate { add_field => { "userID" => "%{sourceUserName}" } }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash_index"
  }
  stdout {
    codec => rubydebug
  }
  http {
    url => "http://localhost/User/"
    http_method => "post"
    content_type => "application/json"
    format => "json"
  }
  file {
    path => "C:\Path\to\Desktop\User\user-%{+yyyy.MM.dd}.json"
    codec => json
  }
  csv {
    path => "C:\Path\to\Desktop\User\user-%{+yyyy.MM.dd}.json.csv"
    csv_options => {
      "write_headers" => true
      "headers" => ["@timestamp", "message", "ad.loginName", "sourceServiceName", "ad.destinationHosts", "name", "userID", "deviceAction", "deviceEventClassId"]
    }
    fields => ["@timestamp", "message", "name", "userID", "deviceAction"]
  }
}

This is the whole configuration. Actually, I don't need to write the logs to a CSV, as I have the JSON, which is more than enough. But if I wanted a ruby filter to repair an event that contains non-UTF-8 data, just to prevent Logstash from crashing, how would I do that?

I am extremely sorry for these basic questions, but I have really hit a wall here.

Thank you to all of you for your patience with a newbie

If the bad encoding is in the [message] field, which seems likely, since the codec verifies that the fields it generates are valid UTF-8, you could try

ruby { code => 'event.set("message", event.get("message").scrub)' }

That will replace invalid byte sequences with the Unicode replacement character �. If you want to use something else you could try

ruby { code => 'event.set("message", event.get("message").scrub("*"))' }
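String#scrub is plain Ruby, so you can see what it does outside Logstash, for example in irb (the input string here is just a made-up example with one invalid byte):

"abc\xFFdef".scrub        # => "abc�def"  (invalid byte replaced with U+FFFD)
"abc\xFFdef".scrub("*")   # => "abc*def"
"abc\xFFdef".scrub("")    # => "abcdef"   (invalid bytes dropped)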

Note that editing the title of this thread to "Invalid UTF-8 encoding stops csv output" might help others with the same problem find it.

Thank you very much, Badger. Just to make sure I understand how to configure this: the block of code you mentioned goes inside the csv output, right? Like this?

csv {
  ruby { code => 'event.set("message", event.get("message").scrub)' }
  path => "C:\Path\to\Desktop\User\user-%{+yyyy.MM.dd}.json.csv"
  csv_options => {
    "write_headers" => true
    "headers" => ["@timestamp", "message", "ad.loginName", "sourceServiceName", "ad.destinationHosts", "name", "userID", "deviceAction", "deviceEventClassId"]
  }
  fields => ["@timestamp", "message", "name", "userID", "deviceAction"]
}

Yes, I will edit the title as you suggested. Thank you so much!

No, the ruby filter needs to be in the filter {} section, at the same level as the prune.
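In other words, something like this (the rest of your filters abbreviated with ...):

filter {
  prune { ... }
  ruby { code => 'event.set("message", event.get("message").scrub)' }
  mutate { gsub => [ "ad.loginName", "USERNM[\\]", "" ] }
  ...
}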

Note that it will modify the [message] field sent to all of the outputs. If you want to only modify the [message] sent to the csv output you would have to use multiple pipelines with the forked-path pattern.
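A rough sketch of what that could look like in pipelines.yml (untested; the pipeline ids and virtual addresses here are made up, and the configs are abbreviated):

- pipeline.id: intake
  config.string: |
    input { syslog { port => 3014 codec => cef syslog_field => "syslog" } }
    # your existing filter section goes here
    output { pipeline { send_to => ["es-out", "csv-out"] } }
- pipeline.id: es-out
  config.string: |
    input { pipeline { address => "es-out" } }
    output { elasticsearch { hosts => ["localhost:9200"] index => "logstash_index" } }
- pipeline.id: csv-out
  config.string: |
    input { pipeline { address => "csv-out" } }
    filter { ruby { code => 'event.set("message", event.get("message").scrub)' } }
    output { csv { ... } }  # your csv output goes here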
