How to break down a multiline entry with nested json - File input plugin

Hi All,

I have the following multiline log in JSON.
My setup reads from a file (where my events are embedded JSON), and the events get broken up incorrectly (each row is treated as a single entry).
How can I tell Logstash that one entry starts with a { and ends with a }?
As you can see, there are some {} in between, but they are indented, so can I somehow tell Logstash to ignore braces that have leading spaces?
I was thinking of using the delimiter property of the file input plugin, but wouldn't that still break up the entries?

     "metadata": {
         "customerIDString": "aa0a01be755244378d9aeff317ae3876",
         "offset": 43852,
         "eventType": "ScheduledReportNotificationEvent",
         "eventCreationTime": 1648080282000,
         "version": "1.0"
     "event": {
         "UserUUID": "1fccffe5-8965-4889-9a9a-e020024aae1a",
         "UserID": "",
         "ExecutionID": "514f482c31624a408e683c9f9e452d25",
         "ReportID": "65b7b69e2e9e4d1fa946d0d1d5db14e4",
         "ReportName": "Container usage report",
         "ReportType": "dashboard",
         "Status": 2,
         "StatusMessage": "Dashboard not found"

You can use a multiline codec to combine everything up to a ^} into a single event. You may need to use the auto_flush_interval option.
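A minimal sketch of that approach, assuming a file input (the path and the interval value below are placeholders, not taken from your setup):

    input {
      file {
        path => "/path/to/your/output"   # placeholder path
        codec => multiline {
          # A line that does NOT match ^} (an unindented closing brace)
          # belongs with the lines that follow it, so the whole object is
          # flushed as one event when the ^} line arrives.
          pattern => "^}"
          negate => true
          what => "next"
          # Flush a pending event if no new line arrives within 5 seconds
          # (example value)
          auto_flush_interval => 5
        }
      }
    }

Without auto_flush_interval, the final object in the file can sit unflushed until another line shows up.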


It worked, @Badger.
However, could you please elaborate a bit on the auto_flush_interval option?
Currently my config works, but every now and then the following INFO messages show up in the log file and nothing happens (the last three lines):

[2022-04-26T18:35:47,914][INFO ][logstash.javapipeline    ] Starting pipeline {:pipeline_id=>"crowdstrike", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["/etc/logstash/conf.d/crowdstrike.conf"], :thread=>"#<Thread:0x7a198858@/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:54 run>"}
[2022-04-26T18:35:49,064][INFO ][logstash.javapipeline    ] Pipeline Java execution initialization time {"seconds"=>1.15}
[2022-04-26T18:35:49,678][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/var/lib/logstash/plugins/inputs/file/.sincedb_ddb72ae539c9547189b77bcc700407f6", :path=>["/var/log/crowdstrike/falconhoseclient/output"]}
[2022-04-26T18:35:49,701][INFO ][logstash.javapipeline    ] Pipeline started {""=>"crowdstrike"}
[2022-04-26T18:35:49,766][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2022-04-26T18:35:49,837][INFO ][] Refreshing Ingestion Resources
[2022-04-26T19:35:49,837][INFO ][] Refreshing Ingestion Auth Token
[2022-04-26T19:35:50,139][INFO ][] Refreshing Ingestion Resources

I cannot speak to the messages. I have no idea what they are from or about.

If you have a file like

"someField": "a"
"anotherField": 1

and use a configuration of the multiline codec like

codec => multiline {
    pattern => "^{"
    negate => true
    what => "previous"
}
then it will combine a line that starts with { with every following line until the next line that starts with {. That means it will only flush the { "someField": "a" } event, because there is no third { to cause the { "anotherField": 1 } event to be flushed. auto_flush_interval will flush that second event based on a timeout.
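Concretely, that tail-of-stream problem can be handled by adding auto_flush_interval to the same codec (the 2 here is an arbitrary example value, not a recommendation):

    codec => multiline {
        pattern => "^{"
        negate => true
        what => "previous"
        # Flush a pending event after 2 seconds with no new lines (example value),
        # so the last object in the file is emitted even though no further
        # { line arrives to push it out.
        auto_flush_interval => 2
    }

The trade-off is that a very small interval could split an event whose lines arrive slowly, so the value should comfortably exceed the time it takes one object to be written.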


Thank you very much @Badger 🙂

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.