Multiline events based on identical fields

Hi all,

I have a not-so-straightforward request from a user to process Progress log files. The catch is that they want the log files multilined in a specific pattern.
All lines that have the exact same timestamp, P-id and T-id should be in a single doc. As you can see from the example below, there are wild combinations of IDs and timestamps. The bold lines (the three stamped 16:22:06.354), for example, should be in one doc.

[18/11/12@15:08:49.357+0100] P-035768 T-2833205120 2 AS AS Application Server connected with connection id: REDACTED

[18/11/12@15:08:49.376+0100] P-035768 T-2833205120 1 AS -- Log entry types activated: Db.Connects:2

[18/11/12@16:22:06.352+0100] P-040789 T-2835367808 1 AS -- Logging level set to = 2

[18/11/12@16:22:06.354+0100] P-040789 T-2835367808 1 AS -- Log entry types activated: ASPlumbing,DB.Connects
[18/11/12@16:22:06.354+0100] P-040789 T-2835367808 2 AS AS Starting application server for REDACTED. (5560)
[18/11/12@16:22:06.354+0100] P-040789 T-2835367808 2 AS AS Application Server Startup. (5473)

[18/11/12@16:22:06.434+0100] P-040789 T-2835367808 2 AS CONN Database master Options: (12699)

[18/11/12@16:22:06.435+0100] P-040789 T-2835367808 2 AS CONN Connected to database master, user number 88. (9543)

[18/11/12@16:22:06.442+0100] P-040789 T-2835367808 2 AS CONN Database temp-db Options: (12699)

I was thinking of sending the log files with Beats to a specific pipeline in Logstash with a single worker and using the aggregate filter, but I can't wrap my head around where to start.
Is this even possible with the built-in Logstash filters?

You should be able to do that with an aggregate filter. If you have already parsed the timestamp, P-id, and T-id you can combine them as the task_id using sprintf references

task_id => "%{timestamp} %{P-id} %{T-id}"

or else dissect them off

dissect { mapping => { "message" => "%{task_id} %{+task_id} %{+task_id} %{}" } }
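To make the effect of that mapping concrete, here is a minimal Ruby sketch of the key it should produce for one of the sample lines. It assumes the three appended pieces are joined with the space delimiter; `task_key` and `SAMPLE` are hypothetical names used only for illustration and are not part of Logstash.

```ruby
# Hypothetical helper: mimic the dissect mapping
#   "%{task_id} %{+task_id} %{+task_id} %{}"
# by joining the first three space-separated tokens of a log line.
def task_key(line)
  # Limit of 4 keeps the rest of the message intact in the last element.
  timestamp, pid, thread, _rest = line.split(' ', 4)
  [timestamp, pid, thread].join(' ')
end

SAMPLE = '[18/11/12@16:22:06.354+0100] P-040789 T-2835367808 2 AS AS Application Server Startup. (5473)'

puts task_key(SAMPLE)
# prints: [18/11/12@16:22:06.354+0100] P-040789 T-2835367808
```

Lines that share timestamp, P-id and T-id therefore get an identical task_id, which is what the aggregate filter keys on.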

Then configure the aggregate filter to flush after a timeout, as in example 3 of the aggregate filter documentation.

In the code option, either create an array

code => '
    map["messages"] ||= []
    map["messages"] << event.get("message")
'

or concatenate them. Whatever works for you.
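As a rough illustration of what the array variant accumulates, the following Ruby sketch groups some of the sample lines from the first post by their leading timestamp, P-id and T-id. The Hash here only stands in for the aggregate filter's per-task map; `group_by_task` and `LINES` are hypothetical names, and the timeout-based flush is not modelled.

```ruby
# Hypothetical stand-in for the aggregate filter's per-task maps:
# one Hash entry per task key, each holding an array of messages.
def group_by_task(lines)
  maps = {}
  lines.each do |line|
    # Key = timestamp + P-id + T-id (the first three tokens).
    key = line.split(' ', 4).first(3).join(' ')
    maps[key] ||= { 'messages' => [] }
    maps[key]['messages'] << line
  end
  maps
end

LINES = [
  '[18/11/12@16:22:06.352+0100] P-040789 T-2835367808 1 AS -- Logging level set to = 2',
  '[18/11/12@16:22:06.354+0100] P-040789 T-2835367808 1 AS -- Log entry types activated: ASPlumbing,DB.Connects',
  '[18/11/12@16:22:06.354+0100] P-040789 T-2835367808 2 AS AS Starting application server for REDACTED. (5560)',
  '[18/11/12@16:22:06.354+0100] P-040789 T-2835367808 2 AS AS Application Server Startup. (5473)',
  '[18/11/12@16:22:06.434+0100] P-040789 T-2835367808 2 AS CONN Database master Options: (12699)'
]

group_by_task(LINES).each do |key, map|
  puts "#{key}: #{map['messages'].size} line(s)"
end
```

The three lines stamped 16:22:06.354 end up in one entry, while the .352 and .434 lines each get their own.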


Thanks for the directions! I'll try that.

I finally got the aggregation to "sorta" work with the following:

aggregate {
  task_id => "%{progress_timestamp} %{progress_pid} %{progress_thread}"
  code => "
    map['progress_message'] ||= []
    map['progress_message'] << event.get('message')
  "
  push_previous_map_as_event => true
  timeout => 10
}

It does seem to trip on the timestamp, though, since I see logs with different timestamps getting aggregated:

[20/06/09@13:25:06.126+0200] P-082061 T-2238457728 1 AS QRX-DEBUG REDACTED, [20/06/09@13:25:06.126+0200] P-082061 T-2238457728 1 AS QRX-DEBUG START HandleRequest: , [20/06/09@13:25:06.289+0200] P-082061 T-2238457728 1 AS QRX-DEBUG ------------------------------------------------------------------------------------------------------------------------------, [20/06/09@13:25:06.289+0200] P-082061 T-2238457728 1 AS QRX-DEBUG ], "info":{"changesOnly":

Looking at the debug logs, I can see the entry below. Is this because the timestamp varies while the P and T values do not, or is there another issue?

[2020-06-09T13:27:00,329][DEBUG][logstash.filters.aggregate] Aggregate create_timeout_event call with task_id '%{progress_timestamp} P-124198 T-242648960'
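The literal '%{progress_timestamp}' in that task_id is a hint: a sprintf reference to a field that does not exist on the event is left unsubstituted, so the key effectively contains only the P and T values. The Ruby sketch below imitates that behaviour under the assumption that progress_timestamp was missing from the events at that point (the thread does not say why). `sprintf_like`, `TEMPLATE`, `EVENT_A` and `EVENT_B` are hypothetical names, not Logstash APIs.

```ruby
# Hypothetical imitation of Logstash's %{field} substitution:
# a reference to a field missing from the event is left as-is.
def sprintf_like(template, event)
  template.gsub(/%\{([^}]+)\}/) do |reference|
    name = Regexp.last_match(1)
    event.key?(name) ? event[name].to_s : reference
  end
end

TEMPLATE = '%{progress_timestamp} %{progress_pid} %{progress_thread}'

# Two events from different moments; neither carries progress_timestamp.
EVENT_A = { 'progress_pid' => 'P-082061', 'progress_thread' => 'T-2238457728' }
EVENT_B = { 'progress_pid' => 'P-082061', 'progress_thread' => 'T-2238457728' }

puts sprintf_like(TEMPLATE, EVENT_A)
# prints: %{progress_timestamp} P-082061 T-2238457728
puts sprintf_like(TEMPLATE, EVENT_A) == sprintf_like(TEMPLATE, EVENT_B)
# prints: true
```

With identical keys, lines from different timestamps land in the same map, which would match the mixed output above.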

@Badger just FYI.

With the following aggregate block it seems to be solid.

aggregate {
  task_id => "%{progress_raw_timestamp} %{progress_pid} %{progress_thread}"
  code => "
    map['progress_message'] ||= []
    map['progress_message'] << event.get('message') + 10.chr
    map['pid'] ||= event.get('progress_pid')
    map['thread'] ||= event.get('progress_thread')
    map['loglevel'] ||= event.get('progress_loglevel')
    map['environment'] ||= event.get('progress_environment')
    map['host'] = event.get('host')
    map['agent'] = event.get('agent')
    map['log'] = event.get('log')
  "
  push_map_as_event_on_timeout => true
  timeout_tags => ['_aggregatetimeout']
  timeout => 10
}
if "_aggregatetimeout" not in [tags]
{
  drop {}
}
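For readers unfamiliar with the Ruby in the code option, here is a small sketch of what those assignments do when two events of the same task arrive. Plain Hashes stand in for the aggregate map and the events; `apply_event`, `FIRST`, `SECOND` and the host value are hypothetical and only for illustration.

```ruby
# Hypothetical stand-in for one aggregate map receiving events of the same task.
def apply_event(map, event)
  map['progress_message'] ||= []                         # create the array once
  map['progress_message'] << event['message'] + 10.chr   # 10.chr is "\n"
  map['loglevel'] ||= event['progress_loglevel']         # ||= keeps the first value
  map['host'] = event['host']                            # = is overwritten by every event
  map
end

# Two sample lines that share timestamp, P-id and T-id but differ in log level.
FIRST  = { 'message' => 'Log entry types activated: ASPlumbing,DB.Connects',
           'progress_loglevel' => '1', 'host' => 'example-host' }
SECOND = { 'message' => 'Application Server Startup. (5473)',
           'progress_loglevel' => '2', 'host' => 'example-host' }

result = apply_event(apply_event({}, FIRST), SECOND)
puts result['progress_message'].size   # prints: 2
puts result['loglevel']                # prints: 1
```

Note that the aggregated event carries the log level of the first line only, and each array element ends with a newline.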
