This might be related to the discussion in Aggregate data based on some field:
Hi everyone,
I have incoming logs in Logstash that look like this:
id1 valueA, id1 valueB, id1 valueC, id1 valueD, id2 valueAA, id2 valueBB, id2 valueCC, id2 valueDD, id2 valueEE, id2 valueFF, id3 valueAAA, ....
- Logs come in groups sharing the same id; once a group is done, a new group begins and there is no risk of ever receiving another log from a previous group.
- Each group can have any number of logs.
- There is no "flag" like "END" or "Complete" to indicate the last log of a group.
In the end, my goal is to obtain Elasticsearch docs like this:
{ "group_id" : "id1", "values" : ["valueA", "valueB", "valueC", "valueD"] }, { "group_id" : "id2", "values" : ["valueAA", "valueBB", "valueCC", "valueDD", "valueEE", "valueFF"] }, { "group_id" : "id3", "values" : ["valueAAA", ...] }, ...
How can I proceed?
I suspect I have to use the aggregate plugin, but all the examples I found, both in the docs and on this forum, had an "END" flag in the last log to trigger end_of_task => true.
I guess I have to store the value of the id somewhere, since I will only know that group 1 is done when group 2 enters the pipeline?
I could append the values of group 1 to an array like this:
code => "map['value_array'] ||= [] ; map['value_array'].push(event.get('current_value')) ;"
, then create an ES doc from the stored values in the elasticsearch output plugin when current_id != last_id (and there is still the problem of the very last group), but I don't know whether the "ifs" belong inside or outside the aggregate filter, nor how to write this in Ruby.
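Reading the aggregate filter docs, I imagine something along these lines (just a sketch, assuming the grok stage already extracted fields named group_id and current_value; the field names and the 120s timeout are my own guesses):

```
filter {
  aggregate {
    task_id => "%{group_id}"
    code => "
      map['group_id'] ||= event.get('group_id')
      map['values'] ||= []
      map['values'] << event.get('current_value')
      event.cancel()   # drop the individual log line, keep only the aggregated doc
    "
    # When an event with a NEW task_id arrives, push the previous map as its own event,
    # so the id change itself closes the group (no END flag needed)
    push_previous_map_as_event => true
    # Flush the very last group after some silence, since no new id will ever arrive for it
    timeout => 120
  }
}
```

If I understand correctly, push_previous_map_as_event would handle the "group 1 is done when group 2 arrives" logic for me, and timeout would cover the last group of all, but I'd like confirmation that this is the right structure.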
Has anyone encountered something similar and can show me what the structure would look like?
Thanks for your thoughts!
PS: Sorry for any mistakes. English is not my native tongue.