Use Logstash to transform a message

Hello everyone,

I have a log that generates thousands of events a day. A group of events, and the entire flow it belongs to, can be identified by the KEY field generated by the application, as in the following example.

DATE 				| EVENT	      | KEY					   	   |QUEUE	
---------------------------------------------------------------------------------------------------------------
2020-04-09T10:54:54  EveIni        202004091554541340000000 	App1
2020-04-09T11:48:10  EveGet        202004091554541340000000 	Verify
2020-04-09T11:48:14  EveCon        202004091554541340000000 	Verify
2020-04-09T11:58:21  EveGet        202004091554541340000000 	UserVerify
2020-04-09T11:58:21  EveCon        202004091554541340000000 	UserVerify
2020-04-09T13:58:22  EveGet        202004091554541340000000 	App2
2020-04-09T13:58:22  EveCon        202004091554541340000000 	App2
2020-04-09T13:58:22  EveEnd        202004091554541340000000 	Process

What I need is to group, for each KEY, all of its messages (EVENT + QUEUE) into a single Elasticsearch document using Logstash, as shown below.

Date					|Key						|Msj1			|Msj2			|Msj3			|Msj4				|Msj5				|Msj6			|Msj7			|Msj8						
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2020-04-09T13:58:22		202004091554541340000000	EveIni-App1		EveGet-Verify	EveCon-Verify	EveGet-UserVerify	EveCon-UserVerify	EveGet-App2 	EveCon-App2		EveEnd-Process


What I have already tried is using the KEY column as the document_id, but the resulting document only keeps the last event, that is:

Date					|Key						|Msj1			|Msj2			|Msj3			|Msj4				|Msj5				|Msj6			|Msj7			|Msj8						
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2020-04-09T13:58:22		202004091554541340000000																															EveEnd-Process
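
For reference, this is roughly the output section I used (hosts and index name are placeholders, not my real values); since the KEY is the _id, every new event simply replaces the previous document:

```
output {
  elasticsearch {
    hosts       => ["localhost:9200"]    # placeholder
    index       => "events-by-key"       # placeholder (assumed index name)
    document_id => "%{KEY}"              # KEY as _id: each event overwrites the previous one
  }
}
```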


I hope you can help me with ideas on how to solve this. Thank you.

There are 3 possible approaches:

  • use the aggregate filter in Logstash to group the events having the same key into a single document (see the sketch after this list). The drawback is that you can use only one pipeline worker.
  • use a scripted upsert to create documents whose _id is the KEY, with fields being added over time. To cope with events potentially arriving out of order, it will also be necessary to keep the timestamp stored somewhere within the document
  • ingest every line as a single document and use a transform job to continuously create a per-key document, with a group by on the key and a scripted metric aggregation
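
For the aggregate filter option, a minimal sketch could look like the following (field names come from your example; the timeout value and the `messages` array field are assumptions, and the pipeline must run with a single worker, e.g. `pipeline.workers: 1`):

```
filter {
  aggregate {
    task_id => "%{KEY}"
    code => "
      map['Date'] = event.get('DATE')
      map['messages'] ||= []
      map['messages'] << event.get('EVENT') + '-' + event.get('QUEUE')
      event.cancel()   # drop the individual event, keep only the aggregated map
    "
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "Key"
    timeout => 600   # assumption: flush a KEY after 10 minutes of inactivity
  }
}
```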

Thanks Luca,

I chose the scripted upsert option and it works great.
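
In case it helps someone else, this is roughly what the output section looks like (hosts, index name and the script body are simplified placeholders, not my exact configuration):

```
output {
  elasticsearch {
    hosts           => ["localhost:9200"]   # placeholder
    index           => "events-by-key"      # placeholder
    document_id     => "%{KEY}"
    action          => "update"
    scripted_upsert => true
    script_lang     => "painless"
    script_type     => "inline"
    # the Logstash event is exposed to the script as params.event
    script => "
      if (ctx._source.messages == null) { ctx._source.messages = []; }
      ctx._source.messages.add(params.event.EVENT + '-' + params.event.QUEUE);
      ctx._source.Date = params.event.DATE;   // to handle out-of-order events, compare timestamps before overwriting
    "
  }
}
```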
