Hello,
I'm trying to process multiline messages received from rsyslog as plain text over TCP, using the multiline codec in the tcp input. In the output I noticed that parsing failed for some messages because Logstash seems to insert a "\n" at random positions inside the message. I'm using Logstash v6.5.
Output example:
{"message":"271 <143>1 2018-12-10T07:57:38.891Z testhost.my.net evr 14025 acc - xx.xx.xxx.xxx - - [10/Dec/2018:07:57:38 +0000] \"POST /api/authsessions?currentSessionId=CE6E10EF014314E3DA085023\n478E94AE HTTP/1.1\" 200 - \"requestTimestamp=1544428658868;responseTimestamp=1544428658891;\"","@version":"1","port":xxxxx,"host":"xx.xx.xxx.xxx","@timestamp":"2018-12-10T07:57:38.996Z","tags":["multiline"],"type":"rfc5424"}
Logstash config:
input {
  tcp {
    host => "0.0.0.0"
    port => 20514
    type => "rfc5424"
    codec => multiline {
      pattern => "^%{NONNEGINT}%{SPACE}<%{NONNEGINT}>"
      negate => true
      what => "previous"
      auto_flush_interval => 1
    }
  }
}
filter {
  if [type] == "rfc5424" {
    grok {
      match => {"message" => "<%{NONNEGINT:syslog_pri}>%{NONNEGINT:version}%{SPACE}(?:-|%{TIMESTAMP_ISO8601:syslog_timestamp})%{SPACE}(?:-|%{IPORHOST:hostname})%{SPACE}(?:%{SYSLOG5424PRINTASCII:env_name}|-)%{SPACE}(?:-|%{SYSLOG5424PRINTASCII:process_id})%{SPACE}(?:-|%{SYSLOG5424PRINTASCII:logtype})%{SPACE}(?:-|(?<structured_data>(\[.*?[^\\]\])+))(?:%{SPACE}%{GREEDYDATA:MSG}|)"
      }
    }
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "test5-%{+YYYY.MM.dd}"
  }
}
Rsyslog config:
action(Name="logstash-01"
       Type="omfwd"
       Target="xx.xx.xx.xx"
       Port="20514"
       Protocol="tcp"
       Tcp_framing="octet-counted"
       Template="RFC5424"
)
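For context, my understanding of octet-counted framing (RFC 6587) is that every message is prefixed with its length in bytes followed by a space, which is where the leading "271 " in the output above comes from. Below is a minimal Python sketch of how a receiver could split such a stream even when a message arrives in several TCP chunks. The function name `split_octet_counted` and the sample data are made up for illustration, and this is not how Logstash handles the stream internally.

```python
def split_octet_counted(buffer: bytes):
    """Split a byte buffer into complete octet-counted frames.

    Returns (messages, remainder). The remainder holds any incomplete
    trailing frame and should be prepended to the next chunk.
    """
    messages = []
    while True:
        space = buffer.find(b" ")
        if space == -1:
            break  # length prefix has not fully arrived yet
        length = int(buffer[:space].decode("ascii"))  # message length in bytes
        start = space + 1
        end = start + length
        if len(buffer) < end:
            break  # message body has not fully arrived yet
        messages.append(buffer[start:end])
        buffer = buffer[end:]
    return messages, buffer


# Hypothetical stream with two frames, delivered in two arbitrary chunks.
stream = b"5 hello11 hello world"
chunks = [stream[:9], stream[9:]]

pending = b""
received = []
for chunk in chunks:
    messages, pending = split_octet_counted(pending + chunk)
    received.extend(messages)

print(received)
```

The point of the sketch is that the chunk boundary falls in the middle of the second frame, yet the message is reassembled without any extra characters because the length prefix, not a newline, decides where a message ends.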
I searched for similar topics but still haven't found a solution.