How to remove a repeating string in a message?

I have a syslog5424_msg field which gets a repeating JSON string. Example log:

{"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"},
{"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"},
{"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"}

Those are the same logs. I just want the first JSON object and want to remove the repeating JSON from the field. This repeating string is optional: sometimes it occurs, sometimes it doesn't. How can I handle that?

Use mutate+gsub to wrap the string in [ and ], parse the JSON, which will result in an array, then use mutate+add_field with a sprintf reference to the first entry in the array.

Is it possible for you to show code? I'm kind of new to ELK.

mutate { gsub => [ "someField", "^", "[", "someField", "$", "]" ] }
json { source => "someField" target => "[@metadata][parsedJson]" }
mutate { add_field => { "someOtherField" => "%{[@metadata][parsedJson][0]}" } }

It's giving _jsonparsefailure; it's adding [ and ] to every JSON object. Actually these logs are one long string with no newlines, I just added newlines for readability.

[{"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"}],
[{"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"}],
[{"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"}]

Does it work any better if you change the gsub to this?...

mutate { gsub => [ "someField", "\A", "[", "someField", "\Z", "]" ] }
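The anchors matter here because mutate+gsub patterns are Ruby regular expressions, and in Ruby `^` and `$` match at every line boundary, while `\A` and `\Z` anchor only the start and end of the whole string. A minimal standalone Ruby sketch of the difference (the sample text is illustrative):

```ruby
text = "line1\nline2"

# ^ matches at the start of every line, so each line gets a bracket
puts text.gsub(/^/, "[")   # prints "[line1" then "[line2"

# \A matches only at the very start of the string
puts text.gsub(/\A/, "[")  # prints "[line1" then "line2"

# similarly, $ matches at every line end, \Z only at the end of the string
puts text.gsub(/$/, "]")   # prints "line1]" then "line2]"
puts text.gsub(/\Z/, "]")  # prints "line1" then "line2]"
```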

No, it is still giving _jsonparsefailure.

logstash_1       | {
logstash_1       |       "@version" => "1",
logstash_1       |        "message" => "[{\"@t\":\"2020-08-07T15:32:19.7835920Z\",\"@mt\":\"{@LogDetail}\",\"ThreadId\":32,\"EventType\":2890355315,\"Application\":\"EsXYZ\",\"Environment\":\"Development\"}, {\"@t\":\"2020-08-07T15:32:19.7835920Z\",\"@mt\":\"{@LogDetail}\",\"ThreadId\":32,\"EventType\":2890355315,\"Application\":\"EsXYZ\",\"Environment\":\"Development\"}, {\"@t\":\"2020-08-07T15:32:19.7835920Z\",\"@mt\":\"{@LogDetail}\",\"ThreadId\":32,\"EventType\":2890355315,\"Application\":\"EsXYZ\",\"Environment\":\"Development\"}]",
logstash_1       |           "path" => "/usr/share/logstash/sample-log/Test-Log-For-Kibana.log",
logstash_1       |           "host" => "3218b4631024",
logstash_1       |     "@timestamp" => 2020-08-07T18:34:43.677Z,
logstash_1       |           "tags" => [
logstash_1       |         [0] "_jsonparsefailure"
logstash_1       |     ]
logstash_1       | }

Are they separate events or a single event with duplicated lines?
If they are multiple events, what you are asking for is de-duplication using the fingerprint plugin. Example in the link: http://www.kawikao.com/elasticsearch-document-deduplication-with-logstash/
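For the multiple-events case, a minimal sketch of the fingerprint approach: hash the message and use the hash as the Elasticsearch document id, so duplicate events overwrite each other instead of being indexed separately. The hosts and index values here are placeholders, not taken from this thread:

```conf
filter {
  fingerprint {
    source => "message"
    method => "MURMUR3"
    target => "[@metadata][fingerprint]"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logs"
    # identical messages produce the same id, so duplicates are collapsed
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```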

Actually, we have 3 logstash servers listening for logs. Sometimes they all get the same logs, so elastic combines them all in one field.

That is strange, since if I run with

input { generator { count => 1 lines => [ '[{"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"}, {"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"}, {"@t":"2020-08-07T15:32:19.7835920Z","@mt":"{@LogDetail}","ThreadId":32,"EventType":2890355315,"Application":"EsXYZ","Environment":"Development"}]' ] } }
filter {
    json { source => "message" target => "parsedJson" }
    mutate { add_field => { "someOtherField" => "%{[parsedJson][0]}" } }
}
output  { stdout { codec => rubydebug { metadata => false } } }

I get a message containing

       "message" => "[{\"@t\":\"2020-08-07T15:32:19.7835920Z\",\"@mt\":\"{@LogDetail}\",\"ThreadId\":32,\"EventType\":2890355315,\"Application\":\"EsXYZ\",\"Environment\":\"Development\"}, {\"@t\":\"2020-08-07T15:32:19.7835920Z\",\"@mt\":\"{@LogDetail}\",\"ThreadId\":32,\"EventType\":2890355315,\"Application\":\"EsXYZ\",\"Environment\":\"Development\"}, {\"@t\":\"2020-08-07T15:32:19.7835920Z\",\"@mt\":\"{@LogDetail}\",\"ThreadId\":32,\"EventType\":2890355315,\"Application\":\"EsXYZ\",\"Environment\":\"Development\"}]",
"someOtherField" => "{\"Environment\":\"Development\",\"@t\":\"2020-08-07T15:32:19.7835920Z\",\"ThreadId\":32,\"Application\":\"EsXYZ\",\"EventType\":2890355315,\"@mt\":\"{@LogDetail}\"}"

To my eye my message field looks exactly like yours.
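Note that someOtherField here holds the first array entry as a JSON string. If you want it back as structured fields, one option (just a sketch, field names illustrative) is to run the json filter a second time over the extracted field:

```conf
filter {
  # parse the extracted first entry into structured fields
  json { source => "someOtherField" target => "firstEntry" }
}
```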

When there is a json parse failure there will be an error message in the logstash log telling you why it failed. What does that message say?

My bad, it's working now. Sorry for the confusion.

Sorry, which one worked? Can you please mark the correct solution, as it will be useful for future readers.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.