CRASHING: source sequence is illegal/malformed utf-8


(Alejandro Olivan) #1

Hi forum!

I'm experiencing an increasing number of Logstash crashes with the following error log:

JSON::GeneratorError: source sequence is illegal/malformed utf-8
to_json at json/ext/GeneratorMethods.java:71
to_json at /opt/logstash/lib/logstash/event.rb:148
receive at /opt/logstash/lib/logstash/outputs/redis.rb:158
handle at /opt/logstash/lib/logstash/outputs/base.rb:86
initialize at (eval):647
call at org/jruby/RubyProc.java:271
output at /opt/logstash/lib/logstash/pipeline.rb:266
outputworker at /opt/logstash/lib/logstash/pipeline.rb:225
start_outputs at /opt/logstash/lib/logstash/pipeline.rb:152

Increasing the rate of the cron job that restarts my Logstash shippers is useless: once the log file contains the conflicting text (some malformed text, I guess), Logstash crashes again on every restart. So no joy until the conflicting log entries stop being generated and a log rotation follows (a new, clean log file, basically).

The problem seems to be related to UserAgent fields in the logs, where bizarre strings are often found. Most of the time everything goes smoothly, but I feel the whole stack is vulnerable if this situation cannot be handled:
A single computer can connect to all my servers, write a conflicting log line on all of them, and make every Logstash instance crash...
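
To illustrate what a "conflicting" line looks like at the byte level, here is a minimal Ruby sketch. The helper name malformed_utf8? and the sample log lines are made up for illustration; they are not part of Logstash.

```ruby
# Hypothetical helper: true when the raw bytes of a line are not valid UTF-8.
def malformed_utf8?(line)
  # dup so the caller's string (possibly frozen) is left untouched
  !line.dup.force_encoding("UTF-8").valid_encoding?
end

good = "GET /stream HTTP/1.1 \"Mozilla/5.0\""
bad  = "GET /stream HTTP/1.1 \"Player\xFF\xFE\""  # 0xFF and 0xFE never occur in valid UTF-8

puts malformed_utf8?(good)  # => false
puts malformed_utf8?(bad)   # => true
```

A check like this could be used to find the offending lines in a log file before they reach the shipper.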

The issue is not new... googling around turns up entries about it from 2012 or so...
I'm running the latest 1.4.5 Logstash, and overall it works well...
Setting the charset at the input (as suggested in posts I found) has proven useless for me... so it's time to ask the gurus...

Has anyone dealt with this successfully?

Thank you very much.
Best regards


(Alejandro Olivan) #2

Hi guys

After suffering from this for months with no visible solution (at least in 1.4.5), I have found what seems to be a definitive fix... Early testing seems to confirm that applying it solves the problem.

All credit goes to someone (Chinese, I guess) whose post I found while googling for occurrences of this problem:
http://www.n0tr00t.com/2015/04/18/dataminding-logstash.html

The problem, as stated in the Logstash error log, is at line 148 of the file /opt/logstash/lib/logstash/event.rb

So here is the edit:

Original function found at line 148, which has to be commented out / replaced:

public
def to_json(*args)
  return @data.to_json(*args)
end # def to_json

Replacement function:

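public
def to_json(*args)
  begin
    return @data.to_json(*args)
  rescue
    # Serialization failed (e.g. malformed UTF-8): discard the event data
    # and emit an empty event instead of crashing the pipeline.
    @data = {}
    return @data.to_json()
  end
end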
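
The replacement above throws away the whole event when serialization fails. As a possible alternative, here is a minimal sketch that replaces only the invalid bytes and keeps the rest of the event. The names clean_utf8 and safe_to_json are hypothetical and not part of Logstash, hash keys are left as they are, and this has not been tested inside Logstash 1.4.5 itself.

```ruby
require "json"

# Hypothetical helper: return a copy of value in which every byte sequence
# that is not valid UTF-8 is replaced by "?". Walks nested hashes and arrays.
def clean_utf8(value)
  case value
  when String
    s = value.dup.force_encoding("UTF-8")
    return s if s.valid_encoding?
    # Invalid bytes show up as single "characters" that fail valid_encoding?
    s.chars.map { |c| c.valid_encoding? ? c : "?" }.join
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k] = clean_utf8(v) }
  when Array
    value.map { |v| clean_utf8(v) }
  else
    value
  end
end

# Hypothetical wrapper: clean the data first, then serialize it.
def safe_to_json(data, *args)
  clean_utf8(data).to_json(*args)
end

event = { "agent" => "Player\xFF/1.0", "tags" => ["stream"], "bytes" => 512 }
puts safe_to_json(event)
```

With this approach the event still reaches the output, with "?" in place of the bad bytes, which makes it easier to see afterwards which client sent them.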

I'm not a Ruby programmer... but it seems to me that the "rescue" statement handles the situation (it looks like a kind of exception handling). Note that when it triggers, the contents of the offending event are discarded and an empty event is sent instead.
This problem had made Logstash completely unusable with streaming servers, because their logs contain conflicting characters.
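
For readers unfamiliar with Ruby: a bare "rescue" catches any StandardError, not only the JSON one. Here is a small sketch of a narrower variant that rescues only JSON::GeneratorError and logs why the event was dropped. The method name serialize_or_empty is hypothetical, and the block stands in for the real @data.to_json call.

```ruby
require "json"

# Hypothetical wrapper: run the given serialization block, and fall back to
# an empty JSON object only when the JSON generator itself fails.
def serialize_or_empty
  yield
rescue JSON::GeneratorError => e
  warn "dropping event, JSON generation failed: #{e.message}"
  "{}"
end

# Normal case: the block's result is returned unchanged.
puts serialize_or_empty { { "message" => "hello" }.to_json }

# Failure case: the generator error is rescued and "{}" is returned.
puts serialize_or_empty { raise JSON::GeneratorError, "source sequence is illegal/malformed utf-8" }
```

Any other kind of error still propagates, so unrelated bugs are not silently hidden.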

Logstash now runs fine, and is no longer out of service for a whole day until log rotation!!!!

Hope someone from the Elasticsearch/Logstash team reads this and passes it on to the developers for analysis...

Regards.

