Assuming I have a file like this:
start k1 v11
other lines
end k1 i12
other lines
other lines
start k1 v21
anything else
end k1 i22
anything else
anything else
anything else
start k1 v31
anything else
anything else
end k1 i32
I'd like to get events like:
k1 v11 i12
k1 v21 i22
k1 v31 i32
i.e. join each start / end pair of lines, taking kA and vBC from the start line and iDE from the end line.
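To make the intended pairing concrete, here is a minimal Ruby sketch of what I'm after, run outside Logstash. The `join_pairs` helper and the `sample` array are names I made up for illustration; the regexps mirror the ones in my config below.

```ruby
# Hypothetical helper, not part of Logstash: joins each "start" line with the
# next "end" line that carries the same key.
def join_pairs(lines)
  pending = {}   # key => value captured from the most recent "start" line
  events = []
  lines.each do |line|
    if (m = line.match(/^start (\w+) +(.*)/))
      pending[m[1]] = m[2]
    elsif (m = line.match(/^end (\w+) +(.*)/))
      v = pending.delete(m[1])
      # An "end" line with no preceding "start" is skipped.
      events << "#{m[1]} #{v} #{m[2]}" if v
    end
    # Every other line is ignored.
  end
  events
end

sample = [
  'start k1 v11', 'other lines', 'end k1 i12',
  'start k1 v21', 'anything else', 'end k1 i22',
  'start k1 v31', 'anything else', 'end k1 i32'
]
puts join_pairs(sample)
```

The key point is that the stored value is removed as soon as its end line arrives, so the same key can be reused by the next pair.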
With this config:
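filter {
  # Tag "start" lines as type 'v' and capture the key (s) and value (v).
  grok {
    match => { 'message' => '^start (?<s>\w+) +(?<v>.*)' }
    add_field => { 'type' => 'v' }
  }
  # Otherwise, tag "end" lines as type 'i' and capture the key (s) and value (i).
  if (! [type]) {
    grok {
      match => { 'message' => '^end (?<s>\w+) +(?<i>.*)' }
      add_field => { 'type' => 'i' }
    }
  }
  # Anything that is neither a start nor an end line is discarded.
  if (! [type]) {
    drop {}
  }
  # Start line: remember v in the map for this key, then discard the event.
  if ([type] == 'v') {
    aggregate {
      code => "map['v'] = event.get('v')"
      map_action => 'create'
      task_id => '%{s}'
    }
    drop {}
  }
  # End line: copy the remembered v onto the event and close the task.
  if ([type] == 'i') {
    aggregate {
      code => "event.set('v', map['v'])"
      end_of_task => true
      map_action => 'update'
      task_id => '%{s}'
      timeout => 60
    }
  }
}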
I'm getting this:
k1 v11 i12
k1 %{v} i22
k1 %{v} i32
I assume that, because the keys are the same, every event but the first is getting its value dropped. Is there any way to fix this?
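To test that assumption, here is a rough Ruby model of how I understand the aggregate map lifecycle: `create` runs only when no map exists for the task, `update` runs only when one does, and `end_of_task` deletes the map. The `simulate` function and the event triples are my own illustration, not the plugin's code. I have not confirmed the order in which Logstash actually hands events to the two aggregate blocks, so the second ordering below is only a hypothesis.

```ruby
# Rough model of the two aggregate blocks; not the plugin's actual code.
def simulate(events)
  maps = {}   # task_id => map, standing in for the filter's per-task storage
  out = []
  events.each do |type, key, value|
    if type == :start
      # map_action => 'create': runs only when no map exists for the task
      maps[key] = { 'v' => value } unless maps.key?(key)
    else
      event = { 's' => key, 'i' => value }
      # map_action => 'update': runs only when a map already exists
      event['v'] = maps[key]['v'] if maps.key?(key)
      maps.delete(key)   # end_of_task => true
      # An unset field shows up as the literal %{v} in my output.
      v = event.fetch('v', '%{v}')
      out << "#{key} #{v} #{value}"
    end
  end
  out
end

in_order = [
  [:start, 'k1', 'v11'], [:end, 'k1', 'i12'],
  [:start, 'k1', 'v21'], [:end, 'k1', 'i22'],
  [:start, 'k1', 'v31'], [:end, 'k1', 'i32']
]
# Hypothetical reordering: every start event is handled before any end event.
starts_first = in_order.partition { |type, _k, _v| type == :start }.flatten(1)

puts simulate(in_order)      # all three values are joined
puts simulate(starts_first)  # only the first value survives
```

In this model, reusing the same key is harmless as long as events are handled strictly in file order; the output I'm seeing only appears when the start events are handled ahead of the end events. So the cause may be event ordering rather than the shared key. The aggregate filter documentation says it needs a single pipeline worker (`-w 1`), which I still have to rule out.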
I want to avoid using the multiline codec here for two reasons. First, the gap between the start and end lines can be huge (i.e. the number of "other lines" and "anything else" lines can be very large). Second, those intermediate lines can contain embedded content that makes a regexp to filter them out harder to write.