I want a constant value to be available to every event that Logstash processes.
For example, I want to derive unique session IDs from the epoch time of the first event. The ID for each subsequent event would then be the sum of this base epoch time and the event's session ID, which is only unique within a day.
There is probably a more direct approach, but logically this is how I imagine it:
You want an epoch time that is unique to each day and shared among all events of that day. So to calculate your base epoch, keep only the date part (%d/%m/%Y) and discard the time of day.
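
As a rough sketch of that truncation, a ruby filter can round the event's epoch time down to midnight. This assumes the event time is in @timestamp and that days are counted in UTC; the field name base_epoch is made up for illustration. Note that this version still runs for every event.

```conf
filter {
  ruby {
    # Epoch seconds of the event, rounded down to 00:00:00 UTC of the same day.
    # 'base_epoch' is a hypothetical field name.
    code => "
      t = event.get('@timestamp').to_i
      event.set('base_epoch', t - (t % 86400))
    "
  }
}
```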
Yes, I can do this, but I do not want to repeat the calculation for every event that passes through the pipeline. Is there any way to compute it once and make the value available to all events?
Maybe you can use the aggregate filter:
When the start event occurs, you compute the session ID and store it in the aggregate map until the session ends in the logs.
At the end of the session (in the logs), you delete the map by setting end_of_task => true.
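
A minimal sketch of that idea, not a tested configuration. It assumes each event carries a session_id field and an event_type field whose value is "start" or "end"; both field names, and base_epoch, are hypothetical and need to be adapted to the real logs. The base value is computed only on the start event and copied from the map onto the later events of the same session.

```conf
filter {
  if [event_type] == "start" {
    aggregate {
      task_id => "%{session_id}"
      map_action => "create"
      # Compute the base epoch once, at session start, and keep it in the map.
      code => "
        t = event.get('@timestamp').to_i
        map['base_epoch'] = t - (t % 86400)
        event.set('base_epoch', map['base_epoch'])
      "
    }
  } else if [event_type] == "end" {
    aggregate {
      task_id => "%{session_id}"
      map_action => "update"
      code => "event.set('base_epoch', map['base_epoch'])"
      # Delete the map once the session is over.
      end_of_task => true
    }
  } else {
    aggregate {
      task_id => "%{session_id}"
      map_action => "update"
      # Reuse the stored value instead of recomputing it.
      code => "event.set('base_epoch', map['base_epoch'])"
    }
  }
}
```

Keep in mind that the aggregate filter only works reliably with a single pipeline worker (-w 1), because the events of one session must be processed in order.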