Hello everyone,
I have a log that generates thousands of events a day. A group of events and its entire flow can be identified by the KEY field generated by the application, as in the following example:
| DATE                | EVENT  | KEY                      | QUEUE      |
|---------------------|--------|--------------------------|------------|
| 2020-04-09T10:54:54 | EveIni | 202004091554541340000000 | App1       |
| 2020-04-09T11:48:10 | EveGet | 202004091554541340000000 | Verify     |
| 2020-04-09T11:48:14 | EveCon | 202004091554541340000000 | Verify     |
| 2020-04-09T11:58:21 | EveGet | 202004091554541340000000 | UserVerify |
| 2020-04-09T11:58:21 | EveCon | 202004091554541340000000 | UserVerify |
| 2020-04-09T13:58:22 | EveGet | 202004091554541340000000 | App2       |
| 2020-04-09T13:58:22 | EveCon | 202004091554541340000000 | App2       |
| 2020-04-09T13:58:22 | EveEnd | 202004091554541340000000 | Process    |
Using Logstash, what I need is to group all the messages (EVENT + QUEUE) of a given KEY into a single Elasticsearch document, like this:
| Date | Key | Msj1 | Msj2 | Msj3 | Msj4 | Msj5 | Msj6 | Msj7 | Msj8 |
|------|-----|------|------|------|------|------|------|------|------|
| 2020-04-09T13:58:22 | 202004091554541340000000 | EveIni-App1 | EveGet-Verify | EveCon-Verify | EveGet-UserVerify | EveCon-UserVerify | EveGet-App2 | EveCon-App2 | EveEnd-Process |
What I have already tried is using the KEY column as the document_id (see the output sketch after the table), but each event overwrites the previous one, so the final document reflects only the last event:
| Date | Key | Msj1 |
|------|-----|------|
| 2020-04-09T13:58:22 | 202004091554541340000000 | EveEnd-Process |
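For reference, my output is roughly this (host and index name are placeholders; the field name `key` comes from my parsing):

```
output {
  elasticsearch {
    hosts       => ["localhost:9200"]   # placeholder host
    index       => "app-events"         # placeholder index name
    document_id => "%{key}"             # KEY as _id: each new event overwrites the same document
  }
}
```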
I hope you can share some ideas on how to solve this. Thank you.
There are 3 possible approaches:
- use the aggregate filter in Logstash to group the events that share the same key into a single document (first sketch below). The drawback is that you can only use one pipeline worker.
- use a scripted upsert to create documents with the KEY as the `_id`, to which fields get added over time (second sketch below). To cope with documents potentially arriving out of order, you also need to keep a timestamp stored somewhere within the document.
- ingest every line as a single document and use a Transform job to continuously build a per-key document, with a group-by on the key and a scripted metric aggregation (third sketch below).
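A minimal sketch of the aggregate approach, assuming the parsed fields are named `date`, `event`, `key` and `queue` as in the sample above (the timeout value is illustrative; it must exceed the longest gap between events of one key):

```
filter {
  aggregate {
    task_id => "%{key}"
    code => "
      map['date'] = event.get('date')
      map['count'] ||= 0
      map['count'] += 1
      map['msj' + map['count'].to_s] = event.get('event') + '-' + event.get('queue')
      event.cancel()
    "
    # no guaranteed end event per key, so flush each map after a period of inactivity
    push_map_as_event_on_timeout => true
    timeout_task_id_field        => "key"
    timeout                      => 3600
  }
}
```

This is what forces the single worker: run Logstash with `pipeline.workers: 1` (or `-w 1`) so all events of a key hit the same aggregate map.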
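A sketch of the scripted upsert via the elasticsearch output, with the same field names assumed; the Painless script appends each message and keeps the newest timestamp, so an out-of-order event cannot regress the date (the Logstash event is exposed to the script as `params.event` by default):

```
output {
  elasticsearch {
    hosts           => ["localhost:9200"]
    index           => "events-by-key"   # illustrative index name
    action          => "update"
    document_id     => "%{key}"
    scripted_upsert => true              # run the script even when the document does not exist yet
    script_lang     => "painless"
    script_type     => "inline"
    script => "
      if (ctx._source.msjs == null) { ctx._source.msjs = []; }
      ctx._source.msjs.add(params.event.get('event') + '-' + params.event.get('queue'));
      String d = params.event.get('date');
      if (ctx._source.date == null || d.compareTo(ctx._source.date) > 0) {
        ctx._source.date = d; // ISO-8601 timestamps compare correctly as strings
      }
    "
  }
}
```

Here the messages land in one `msjs` array rather than numbered `Msj1..MsjN` fields; numbered fields work the same way if you also store a counter in the document.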
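And a sketch of the transform (Kibana Dev Tools syntax), assuming the raw events are indexed as-is and that `key`, `event` and `queue` are mapped as `keyword` (use `event.keyword` etc. if they are `text` with a keyword sub-field):

```
# transform id, source and dest index names are illustrative
PUT _transform/group-by-key
{
  "source": { "index": "app-events" },
  "dest":   { "index": "events-by-key" },
  "sync":   { "time": { "field": "date", "delay": "60s" } },
  "pivot": {
    "group_by": {
      "key": { "terms": { "field": "key" } }
    },
    "aggregations": {
      "date": { "max": { "field": "date" } },
      "msjs": {
        "scripted_metric": {
          "init_script": "state.msjs = []",
          "map_script": "state.msjs.add(doc['event'].value + '-' + doc['queue'].value)",
          "combine_script": "return state.msjs",
          "reduce_script": "def all = []; for (def s : states) { if (s != null) { all.addAll(s) } } return all"
        }
      }
    }
  }
}
```

Start it with `POST _transform/group-by-key/_start`; the `sync` block makes it continuous, so the per-key documents keep updating as new events arrive.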
Thanks Luca,
I chose the scripted upsert option and it works great.