Get the most recent event among events that have the same ID

Pierre2 · June 20, 2018, 9:34am

Hello,

My log lines each contain a @timestamp and an ID, and I want to add a field to the line that is the most recent among the lines with the same ID. If only one line has a given ID, I want to add the field to that line. Here is an example of what I want.

Example:

@timestamp : May 9th 2018 10:22:15.097 | ID : 1234

@timestamp : May 9th 2018 11:00:00.00 | ID : 1234

@timestamp : May 9th 2018 12:00:00.00 | ID : 5896

@timestamp : May 9th 2018 13:00:00.00 | ID : 1234

If I process these four lines, I want a new field called "Type" with the value "Final" added to line 4 and line 3.

This is the result I want:

@timestamp : May 9th 2018 10:22:15.097 | ID : 1234

@timestamp : May 9th 2018 11:00:00.00 | ID : 1234

@timestamp : May 9th 2018 12:00:00.00 | ID : 5896 | Type : Final

@timestamp : May 9th 2018 13:00:00.00 | ID : 1234 | Type : Final
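To make the rule concrete, here is a minimal sketch in plain Python (not Logstash configuration). It assumes all four example lines are available in memory at once, which is not how Logstash processes events; the `events` list and the `mark_final` helper are hypothetical names used only for illustration.

```python
from datetime import datetime

# Hypothetical in-memory copies of the four example lines.
events = [
    {"@timestamp": datetime(2018, 5, 9, 10, 22, 15, 97000), "ID": "1234"},
    {"@timestamp": datetime(2018, 5, 9, 11, 0, 0), "ID": "1234"},
    {"@timestamp": datetime(2018, 5, 9, 12, 0, 0), "ID": "5896"},
    {"@timestamp": datetime(2018, 5, 9, 13, 0, 0), "ID": "1234"},
]


def mark_final(events):
    """Add Type=Final to the newest event of each ID (mutates in place)."""
    latest = {}  # ID -> newest event seen so far for that ID
    for event in events:
        current = latest.get(event["ID"])
        if current is None or event["@timestamp"] > current["@timestamp"]:
            latest[event["ID"]] = event
    for event in latest.values():
        event["Type"] = "Final"
    return events


mark_final(events)
```

An ID that appears only once is trivially its own newest event, so it gets the field as well.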

I don't think Logstash allows us to do that. If not, how can I do it automatically? For example, by checking the events every x seconds and adding the "Type" field.

Badger · June 20, 2018, 2:20pm

Use an aggregate filter. Look at the documentation for Example #3: No end event.

Pierre2 · June 21, 2018, 6:55am

Thanks for your answer. I have already tried this filter, but I don't understand how it can solve my problem. As far as I can tell, it can add a field to all events with the same ID, but not only to the most recent event among them.

Badger · June 21, 2018, 10:22pm

In Logstash, you can't know that an event is the final one for its ID until the timeout has expired, at which point the document has long since been indexed. But it is not too late to update that document. You could save the entire document in the aggregate map (exploded into fields). Then, when the timeout expires, the filter will create a document based on what is in the map. If you generate your own document ids with a fingerprint filter, you can regenerate the same id for that document, so that it is re-indexed and overwrites the original in the index.

I think it is possible to do it, but I haven't tested it.
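The overwrite idea can be illustrated with a small simulation in plain Python. This is not Logstash configuration and has the same untested status as the suggestion itself. It assumes that events for one ID arrive in timestamp order and that the document id is a hash of the ID and @timestamp fields; `doc_id`, `index`, `last_seen`, `on_event` and `on_timeout` are hypothetical names standing in for the fingerprint filter, the Elasticsearch index, the aggregate map, and the per-event and timeout code.

```python
import hashlib


def doc_id(event):
    """Deterministic id built only from the fields that identify a log line."""
    key = f'{event["ID"]}|{event["@timestamp"]}'
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


index = {}      # stand-in for the Elasticsearch index: document id -> document
last_seen = {}  # stand-in for the aggregate map: ID -> last event received


def on_event(event):
    """Index the event as usual and remember a copy of it for its ID."""
    index[doc_id(event)] = dict(event)
    last_seen[event["ID"]] = dict(event)


def on_timeout(task_id):
    """No new event arrived for this ID: re-index the saved copy as final."""
    final = last_seen.pop(task_id)
    final["Type"] = "Final"
    # "Type" is not part of the hash, so the id matches the original document
    # and this write replaces it instead of creating a new one.
    index[doc_id(final)] = final


on_event({"@timestamp": "2018-05-09T11:00:00.000Z", "ID": "1234"})
on_event({"@timestamp": "2018-05-09T13:00:00.000Z", "ID": "1234"})
on_timeout("1234")
```

After the timeout the simulated index still holds two documents, and only the later one carries the extra field.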

Badger · June 22, 2018, 1:04am

You do not go into much detail on your use case, but if the question you are asking is "for each distinct value of field x, what is the most recent document in this time range that has that value in field x", then you might want to post that question in the Elasticsearch forum.
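On the Elasticsearch side, one common way to phrase that question is a terms aggregation with a top_hits sub-aggregation. The sketch below only builds the request body as a Python dictionary and prints it as JSON; it does not contact a cluster. It assumes the field can be aggregated on (for example, a keyword mapping), and the helper name, the aggregation names `by_value` and `latest`, and the bucket limit of 1000 are illustrative choices.

```python
import json


def latest_per_value_query(field, start, end, max_values=1000):
    """Request body: newest document per distinct value of `field` in a time range."""
    return {
        "size": 0,  # only the aggregation result is wanted, not the raw hits
        "query": {"range": {"@timestamp": {"gte": start, "lte": end}}},
        "aggs": {
            "by_value": {
                # one bucket per distinct value of the field
                "terms": {"field": field, "size": max_values},
                "aggs": {
                    "latest": {
                        # keep only the newest document in each bucket
                        "top_hits": {
                            "size": 1,
                            "sort": [{"@timestamp": {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


body = latest_per_value_query("ID", "2018-05-09T00:00:00Z", "2018-05-10T00:00:00Z")
print(json.dumps(body, indent=2))
```

This answers the question at query time, so nothing has to be added to the documents when they are indexed.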

Pierre2 · June 22, 2018, 6:45am

Thanks for your answer. I understand what you mean, but I have a question: does that mean I have to rewrite all the filters I have already written, but in Ruby? I ask because the aggregate filter expects Ruby code.

system · July 20, 2018, 6:52am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
