I have messages arriving that contain very large blocks of broken JSON. These logs fail to parse on arrival, so I'm using Filebeat to read them back out of logstash-plain.log and feed them into Logstash for extra processing. Each log contains 30+ individual events separated by commas, but the first and last events are getting clipped somewhere in the pipeline. My plan is to use the mutate filter's gsub to put a unique sequence between events, then use the split filter to break them apart into individual messages, and finally run each one back through the json filter. I realize the first and last events are unusable, but at least I would recover the 30 or so in the middle.
The messages arriving look like this:
tamp\": 124386584932940, \"host\": \"host.sys.name\" , \"name\":\"name\" } , { \"timestamp\": 124386584932940, \"host\": \"host.sys.name\" , \"name\":\"name\" } , { \"timestamp\": 124386584932940, \"host\": \"host.sys.name\" , \"name\":\"name\" } , (repeats 30+ times)
There are commas and spaces all over these logs, so I have to use the entire string } , { to target the comma between events. I need to preserve the braces on either side so the JSON stays valid, and I need to change the comma to something unique that I can split on later.
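For reference, once the delimiter is in place, the downstream filters I'm planning look roughly like this (a sketch only; XX.XX is the placeholder delimiter I intend to inject, and the field names assume everything lands in message):

filter {
  # break the combined message apart on the injected delimiter,
  # producing one event per JSON object
  split {
    field => "message"
    terminator => "XX.XX"
  }
  # parse each individual piece as JSON
  json {
    source => "message"
  }
}

As I understand it, split's terminator defaults to \n, so it has to be set explicitly here.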
For the replacement step itself, I set up the following mutate block:
mutate {
  gsub => [
    "message", "} , {", "}XX.XX{"
  ]
}
When my logs run through this filter, instead of replacing } , { with }XX.XX{, it replaces it with }, {.
I understand that } and , can be treated as special characters, since gsub's match argument is a regular expression, but when I try to escape them with \ the results are still the same. What's the correct format to make this gsub match the literal string } , {?
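For completeness, this is one of the escaped variants I tried (reproduced from memory; the exact escaping varied between attempts, but the output was identical every time):

mutate {
  gsub => [
    # escaping the braces and the comma made no difference to the output
    "message", "\} \, \{", "}XX.XX{"
  ]
}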