Looking for several pointers here, as I'm trying to create a logstash codec plugin to solve a problem in one of our data streams.
First, the data stream in question mixes two or four "binary data" bytes in with the other data. Those binary bytes make logstash very unhappy: the data= field below, instead of being converted completely to JSON, either loses characters in field names or values, or throws a _jsonparsefailure. I have checked the data, and several JSON lint tools have no issue with the data presented below.
UCS0000000000�UPLOAD_INLINE;type={vq};rto={15};data={"uploadTime": "1970-01-01T00:03:10+0530", "report": {"LocalMetrics": {"Signal": {"RERL": "127"}, "BurstGapLoss": {"BLD": "0", "GLD": "0.0", "BD": "0", "GD": "28400", "GMIN": "16"}, "PacketLoss": {"NLR": "0", "JDR": "0"}, "SessionDesc": {"PPS": "50", "PT": "9", "SSUP": "SSUP=off"}, "Timestamps": {"START": "1970-01-01T00:02:41+0530", "STOP": "1970-01-01T00:03:10+0530"}, "QualityEst": {"RCQ": "93", "RLQ": "94", "MOSLQ": "3.8", "MOSCQ": "3.8"}, "JitterBuffer": {"JBM": "70", "JBA": "3", "JBR": "5", "JBN": "50", "JBX": "240"}, "Delay": {"RTD": "19", "ESD": "75", "OWD": "70", "IAJ": "2"}}, "RemoteMetrics": {"Signal": {"RERL": "127"}, "BurstGapLoss": {"BLD": "0", "GLD": "0.0", "BD": "0", "GD": "24900", "GMIN": "16"}, "PacketLoss": {"NLR": "0", "JDR": "0"}, "SessionDesc": {"PPS": "50", "PT": "9", "SSUP": "SSUP=off"}, "Timestamps": {"START": "1970-01-01T00:02:41+0530", "STOP": "1970-01-01T00:03:10+0530"}, "QualityEst": {"RCQ": "93", "MOSLQ": "3.8", "MOSCQ": "3.8"}, "JtterBuffer": {"JBM": "40", "JBA": "3", "JBR": "5", "JBN": "20", "JBX": "240"}, "Delay": {"RTD": "43", "ESD": "47", "IAJ": "0"}}}, "VQReportType": "VQIntervalReport"}
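As a sanity check (a stand-alone sketch; the payload below is a hand-picked subset of the data= value above, not the full document), Ruby's standard JSON library parses it without complaint once the binary prefix is out of the picture:

```ruby
require 'json'

# A representative slice of the data= payload above; the full payload
# parses the same way once the leading binary bytes are gone.
payload = '{"uploadTime": "1970-01-01T00:03:10+0530", ' \
          '"report": {"LocalMetrics": {"Signal": {"RERL": "127"}}}, ' \
          '"VQReportType": "VQIntervalReport"}'

doc = JSON.parse(payload)
puts doc["VQReportType"]
# => VQIntervalReport
```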
So I figured that if I could remove the binary bytes before the data reaches the main logstash filter, that would prevent my issue from occurring. I attempted to create a logstash codec plugin to do this and have run into two issues:
- No data ever exits this codec
- I can't find a way to see what's going on IN the codec to try and troubleshoot the problem
Any suggestions on this?
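For what it's worth, the stripping approach itself does what I want when I run it outside of Logstash (a stand-alone sketch; the \x00 and \x02 bytes below stand in for the binary framing bytes described above):

```ruby
# Stand-alone check of the stripping regex used in the codec below;
# [^[:print:]] matches any non-printable byte, including NULs.
raw   = "UCS\x00\x00\x00\x02UPLOAD_INLINE;type={vq};rto={15};data={\"uploadTime\": \"1970-01-01T00:03:10+0530\"}"
clean = raw.gsub(/[^[:print:]]/, '')
puts clean
# => UCSUPLOAD_INLINE;type={vq};rto={15};data={"uploadTime": "1970-01-01T00:03:10+0530"}
```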
Plugin Code:
# encoding: utf-8
require "logstash/codecs/base"
require "logstash/codecs/line"
require "logstash/namespace"
# This codec strips non-printable ("binary") bytes from each line of
# the incoming data stream, for example:
#
# input {
#   stdin { codec => ucs }
# }
#
# or
#
# output {
#   stdout { codec => ucs }
# }
#
class LogStash::Codecs::Ucs < LogStash::Codecs::Base
  # The codec name
  config_name "ucs"

  config :format, :validate => :string
  config :charset, :validate => ::Encoding.name_list, :default => "UTF-8"

  public
  def register
    @converter = LogStash::Util::Charset.new(@charset)
    @converter.logger = @logger
    @lines = LogStash::Codecs::Line.new
    @lines.charset = "UTF-8"
  end # def register

  def decode(data)
    @logger.info("Data: #{data}")
    @lines.decode(data) do |line|
      # Strip every non-printable byte before handing the line on
      replace = line.gsub(/[^[:print:]]/, '')
      @logger.info("replace: #{replace}")
      yield LogStash::Event.new(replace)
    end
  end # def decode

  # Encode a single event; this returns the raw data to be emitted as a String
  def encode_sync(event)
  end # def encode_sync
end # class LogStash::Codecs::Ucs
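One thing I'm starting to suspect: if LogStash::Codecs::Line yields LogStash::Event objects rather than raw strings, then line.gsub inside my block would raise NoMethodError and nothing would ever be yielded, which would explain the silence. Here is a stand-alone sketch of the logic I think I actually need (StubEvent is a hypothetical stand-in for LogStash::Event, which exposes get/set the same way), in case someone can confirm or refute this:

```ruby
# Hypothetical stand-in for LogStash::Event so the decode logic can be
# exercised outside of Logstash; the real class also exposes get/set.
class StubEvent
  def initialize(fields)
    @fields = fields
  end

  def get(key)
    @fields[key]
  end

  def set(key, value)
    @fields[key] = value
  end
end

# If the Line codec yields events, the cleanup has to target the
# event's "message" field rather than the event object itself.
def strip_binary(event)
  message = event.get("message")
  event.set("message", message.gsub(/[^[:print:]]/, ''))
  event
end

event = StubEvent.new("message" => "UCS\x00\x00\x02UPLOAD_INLINE;data={}")
puts strip_binary(event).get("message")
# => UCSUPLOAD_INLINE;data={}
```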