Get the most recent event among events that have the same ID



My log lines each contain a @timestamp and an ID. Among the lines that share the same ID, I want to add a field to the most recent one. If only one line has a given ID, I want to add the field to that line. Here is an example of what I want.

Example:

@timestamp : May 9th 2018 10:22:15.097 | ID : 1234

@timestamp : May 9th 2018 11:00:00.00 | ID : 1234

@timestamp : May 9th 2018 12:00:00.00 | ID : 5896

@timestamp : May 9th 2018 13:00:00.00 | ID : 1234

If I process these four lines, I want a new field called "Type" with the value "Final" on line 4 and on line 3.

This is the result I want:

@timestamp : May 9th 2018 10:22:15.097 | ID : 1234

@timestamp : May 9th 2018 11:00:00.00 | ID : 1234

@timestamp : May 9th 2018 12:00:00.00 | ID : 5896 | Type : Final

@timestamp : May 9th 2018 13:00:00.00 | ID : 1234 | Type : Final

I don't think Logstash allows us to do that. If it doesn't, how can I do this automatically? For example, by checking the events every x seconds and adding the "Type" field.


Use an aggregate filter. Look at the documentation for Example #3: No end event.


Thanks for your answer. I have already tried this filter, but I don't understand how it solves my problem. As far as I can tell, it can add a field to all events with the same ID, but not only to the most recent event among them.


In Logstash, you can't know that an event is the final one for its ID until the timeout has expired, and by then the document has long since been indexed. But it is not too late to update that document. You could save the entire event in the aggregate map (exploded into fields). Then, when the timeout expires, the filter will create an event based on what is in the map. If you generate your own document IDs with a fingerprint filter, you can regenerate the same ID for that event, and it will re-index and overwrite the document in the index.

I think it is possible to do it, but I haven't tested it.
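To make the idea concrete, here is a minimal Python simulation of it. This is not Logstash configuration and I have not run it against a real pipeline: the `index` dict stands in for the Elasticsearch index, the `latest` dict stands in for the aggregate map, and `fingerprint` and `index_events` are hypothetical helper names. It assumes timestamps are ISO 8601 strings, so comparing them as strings orders them correctly.

```python
import hashlib


def fingerprint(event):
    """Deterministic document id built from @timestamp and ID only.

    "Type" is deliberately excluded, so the id stays the same
    after the field is added (stand-in for a fingerprint filter).
    """
    key = f"{event['@timestamp']}|{event['ID']}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def index_events(events):
    index = {}   # document id -> document (stand-in for the index)
    latest = {}  # ID -> copy of the most recent event (stand-in for the aggregate map)

    for event in events:
        # Every event is indexed right away, without "Type".
        index[fingerprint(event)] = dict(event)
        # Keep a full copy of the most recent event for this ID.
        prev = latest.get(event["ID"])
        if prev is None or event["@timestamp"] >= prev["@timestamp"]:
            latest[event["ID"]] = dict(event)

    # "Timeout expired": emit each saved event again with Type added.
    # The id is identical, so the earlier document is overwritten.
    for saved in latest.values():
        index[fingerprint(saved)] = dict(saved, Type="Final")

    return index


events = [
    {"@timestamp": "2018-05-09T10:22:15.097", "ID": "1234"},
    {"@timestamp": "2018-05-09T11:00:00.000", "ID": "1234"},
    {"@timestamp": "2018-05-09T12:00:00.000", "ID": "5896"},
    {"@timestamp": "2018-05-09T13:00:00.000", "ID": "1234"},
]

index = index_events(events)
for doc in sorted(index.values(), key=lambda d: d["@timestamp"]):
    print(doc)
```

The point of the sketch is that the document id must not depend on the field added later. Otherwise the re-emitted event would be indexed as a fifth document instead of replacing the fourth.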


You do not go into much detail about your use case, but if the question you are asking is "for each distinct value of field x, what is the most recent document in this time range that has that value in field x", then you might want to post that question in the Elasticsearch forum.
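As a rough illustration of what such a query could look like, one common shape is a `terms` aggregation on the ID field with a `top_hits` sub-aggregation sorted by @timestamp. The sketch below only builds the request body as a Python dict. The function name `latest_per_id_query` and the aggregation names `by_id` and `latest` are hypothetical, and it assumes the `ID` field is mapped so that it can be aggregated on (for example as a keyword).

```python
import json


def latest_per_id_query(start, end, id_field="ID", max_ids=1000):
    """Build a search body that returns the newest document per ID."""
    return {
        "size": 0,  # only the aggregation results are needed
        "query": {"range": {"@timestamp": {"gte": start, "lte": end}}},
        "aggs": {
            "by_id": {
                # One bucket per distinct ID value, up to max_ids buckets.
                "terms": {"field": id_field, "size": max_ids},
                "aggs": {
                    "latest": {
                        # Keep only the most recent document in each bucket.
                        "top_hits": {
                            "size": 1,
                            "sort": [{"@timestamp": {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


body = latest_per_id_query("2018-05-09T00:00:00", "2018-05-10T00:00:00")
print(json.dumps(body, indent=2))
```

This finds the most recent document per ID at query time, so nothing has to be rewritten in the index. The trade-off is that no "Type" field is stored on the documents themselves.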


Thanks for your answer. I understand what you mean, but I have a question: does that mean I have to rewrite in Ruby all the filters I have already written? I ask because the aggregate filter expects Ruby code.
