That would require Logstash to look into the future and predict what events will occur after the current event. Tricky.
I think the best you can do is to use push_map_as_event_on_timeout on an aggregate filter.
aggregate {
  task_id => "%{users}"
  code => '
    map["completion"] ||= 0
    c = event.get("completion")
    # Guard against events that lack a [completion] field (nil would raise on >)
    if c && c > map["completion"]
      map["completion"] = c
    end
    event.cancel
  '
  push_map_as_event_on_timeout => true
  timeout_task_id_field => "users"
  timeout => 600 # 10 minute timeout
}
Note that the event that is created will only contain the fields you add to the map, so in this case it will have [completion] and [users] (the latter because timeout_task_id_field is set). If there are other fields you want to preserve, add them to the map as well.
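For example, to carry an extra field through to the generated event you would copy it into the map inside the code block (here [status] is just a hypothetical field name, not something from your data):

map["status"] ||= event.get("status")

Any key you set on the map this way becomes a field on the event that push_map_as_event_on_timeout creates.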
If the [users] field is only output when it changes then you could use a ruby filter to add it back in. You need order preserved, so make sure that pipeline.workers is 1 and pipeline.ordered evaluates to true.
ruby {
  init => '@user = nil'
  code => '
    user = event.get("users")
    if user
      @user = user
    else
      event.set("users", @user)
    end
  '
}
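The carry-forward logic in that filter can be sketched in plain Ruby, using hashes as hypothetical stand-ins for Logstash events and a local variable in place of the filter's @user instance variable:

```ruby
# Sketch of the carry-forward logic: when an event has no "users" field,
# fill it in from the most recent event that did have one.
events = [
  { "users" => "alice", "completion" => 10 },
  { "completion" => 20 },                     # no users field, inherits "alice"
  { "users" => "bob", "completion" => 5 },
  { "completion" => 15 }                      # no users field, inherits "bob"
]

last_user = nil
filled = events.map do |e|
  if e["users"]
    last_user = e["users"]   # remember the last value seen
  else
    e["users"] = last_user   # back-fill from the remembered value
  end
  e
end
```

This only works if events arrive in their original order, which is why pipeline.workers must be 1 and pipeline.ordered must evaluate to true.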