Multiline message problem

Hello,

I'm parsing multiline messages into Elasticsearch using Logstash. 99.9% of the messages are parsed correctly, but a small fraction is not, and as far as I can see I'm not getting any errors. The incorrectly parsed messages do end up in Elasticsearch, but somehow the original timestamp is removed from the message and replaced with the timestamp of the moment the message was parsed. Also, none of the information in those messages is indexed. Some help would be appreciated.

script:

input {

  file {
    path => ["C:/path"]
    sincedb_path => "nul"
    start_position => "beginning"
    mode => "read"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => "previous"
      max_lines => 100000
      auto_flush_interval => 1
    }
  }

} # end input

filter {

  fingerprint {
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "MURMUR3"
  }

  grok {
    match => {
      "message" => ["%{TIMESTAMP_ISO8601:timestamp}%{GREEDYDATA:rest}"]
    }
  }

  grok {
    match => {
      "rest" => ['value1%{SPACE}=%{SPACE}%{DATA:field1}\r']
    }
  }

  grok {
    match => {
      "rest" => ['value2%{SPACE}=%{SPACE}%{NUMBER:field2}']
    }
  }

  grok {
    match => {
      "rest" => ['value3%{SPACE}:%{SPACE}%{NOTSPACE:field3}']
    }
  }

  if [field1] == "100" {

    grok {
      match => {
        "rest" => ['value4.%{SPACE}:%{SPACE}%{GREEDYDATA:field4}']
      }
    }

    grok {
      match => {
        "value5" => ['%{GREEDYDATA:field5}\n']
      }
    }

    dissect {
      mapping => {
        'field5' => "%{value6->} %{value7->} "
      }
    }

    ruby {
      code => "
        mat = event.get('value8').scan(/value9[2-3]([A-Z]+ [0-9]+)/)
        event.set('field10', mat.flatten)
      "
    }

  } # end if1

  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss" ]
    timezone => "Europe/London"
    target => "@timestamp"
  }

  if [id_kort] != "value11" and "value12" in [message] {
    mutate {
      add_field => { "field14" => "%{value13}%{value14}" }
    }
  }

  mutate {
    remove_field => ["field4"]
    gsub => ["field15", " ", ""]
    strip => ["field2", "field3"]
  }

} # end filter

output {

  elasticsearch {
    hosts => "localhost:9200"
    document_id => "%{[@metadata][fingerprint]}"
    index => "name1"
  }

} # end output
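A side note on the ruby filter in the script: `String#scan` with a single capture group returns an array of one-element arrays, which is why the `mat.flatten` call is needed before storing the result. A minimal standalone sketch of that behavior, using an invented sample string (only the regex comes from the config; the value9.../field names are the anonymized placeholders):

```ruby
# Standalone illustration of the scan + flatten used in the ruby filter above.
# The input string is made up for this example.
raw = "value92ABC 12 some text value93DEF 34"

# scan yields one sub-array per match, each holding the capture group's text:
# [["ABC 12"], ["DEF 34"]]
mat = raw.scan(/value9[2-3]([A-Z]+ [0-9]+)/)

# flatten removes the nesting, giving what the filter stores in field10:
# ["ABC 12", "DEF 34"]
flat = mat.flatten
puts flat.inspect
```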

logfile:

2019-12-16 08:00:00 message received; id=105
field1: value1
field2: value2
field3:value3
2019-12-16 08:00:01 message received; id =107
field1: value5
field2: value6
field3:value7
2019-12-16 08:00:03 message received; id =110
field1: value8
field2: value9
field3: value10
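For context on how the multiline codec treats a file like this: with negate => true and what => previous, every line that does not start with a timestamp is appended to the preceding event. A minimal Ruby sketch of that grouping logic (an illustration only, not Logstash's actual code):

```ruby
# Sketch of multiline grouping with negate => true / what => previous.
lines = [
  "2019-12-16 08:00:00 message received; id=105",
  "field1: value1",
  "field2: value2",
  "2019-12-16 08:00:01 message received; id =107",
  "field1: value5",
]

events = []
lines.each do |line|
  if line =~ /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} /
    events << line                 # a timestamp starts a new event
  else
    events[-1] << "\n" << line     # a continuation line joins the previous event
  end
end
# => two events, each a header line plus its field lines
```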

Is it possible that messages are being split across events due to the low value for the auto flush interval?
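If that is the cause, one way to test it is to give the codec a longer flush window, so Logstash waits longer for continuation lines before emitting a partial event. A sketch of the adjustment (the value 5 is illustrative, not a recommendation; tune it to how slowly your log file is written):

```
codec => multiline {
  pattern => "^%{TIMESTAMP_ISO8601} "
  negate => true
  what => "previous"
  max_lines => 100000
  # Longer window before a pending event is flushed without seeing
  # the next timestamp line (seconds; 5 is just an example).
  auto_flush_interval => 5
}
```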


That's it! Thanks so much, Badger, unlimited kudos to you! :partying_face:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.