I want to be able to group them by periods. That is, if the timestamp of a log is 30 minutes or more after the timestamp of the previous log, then it belongs to a different period.
Hi Tomás, normally you'd want to use Logstash to do something like this. Unfortunately, I just checked with the Logstash team and it looks like Logstash only processes events individually (or sometimes as a batch). Either way, you can't refer to previous events the way you're asking.
However, I think you can group your events into fixed "time buckets", for example one bucket per hour. You'd still end up with groups (1 and 2), (3 and 4), and (5). Would this help address your requirement?
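To make the bucketing idea concrete, here is a minimal sketch in plain Python, run outside Logstash and Elasticsearch. The function name `bucket_by_hour` and the sample timestamps are made up for illustration; it only shows what "grouping by every hour" means.

```python
from collections import defaultdict
from datetime import datetime


def bucket_by_hour(timestamps):
    """Group timestamps by the start of the hour they fall into."""
    buckets = defaultdict(list)
    for ts in timestamps:
        # Truncate to the top of the hour to get the bucket key.
        key = ts.replace(minute=0, second=0, microsecond=0)
        buckets[key].append(ts)
    return dict(buckets)


# Hypothetical sample data, not taken from the original logs.
sample = [
    datetime(2020, 1, 1, 10, 5),
    datetime(2020, 1, 1, 10, 50),
    datetime(2020, 1, 1, 11, 10),
]
hourly = bucket_by_hour(sample)
```

One caveat with this approach: the bucket boundaries are fixed, so two logs only a few minutes apart (say 10:58 and 11:02) land in different buckets, which is not the same as the 30-minute-gap rule you described.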
Would it be possible to somehow create an index from the existing one?
This way, I could:
1. Group all the events by ClientIp
2. Order each group by Timestamp
3. Go through the logs sequentially, storing the timestamp of the latest log in a SessionEnd field of the aggregation
4. Compare the timestamp of each log to (SessionEnd + 30 minutes) to decide whether to add that log to the existing session or start a new one
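For reference, the steps above can be sketched client-side in plain Python, without relying on any Logstash or Elasticsearch feature. This is only an illustration of the logic: `sessionize`, `SessionStart`, and `Count` are names I made up, the input is assumed to be `(ClientIp, Timestamp)` pairs already pulled from the index, and I am assuming that a gap of exactly 30 minutes starts a new session.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

SESSION_GAP = timedelta(minutes=30)


def sessionize(logs, gap=SESSION_GAP):
    """Split (client_ip, timestamp) pairs into per-client sessions."""
    sessions = []
    # Sort by ClientIp, then Timestamp, so groupby sees each client once.
    ordered = sorted(logs, key=itemgetter(0, 1))
    for client_ip, events in groupby(ordered, key=itemgetter(0)):
        current = None
        for _, ts in events:
            # Assumption: a gap of 30 minutes or more opens a new session.
            if current is None or ts >= current["SessionEnd"] + gap:
                current = {
                    "ClientIp": client_ip,
                    "SessionStart": ts,  # hypothetical field
                    "SessionEnd": ts,
                    "Count": 0,  # hypothetical field
                }
                sessions.append(current)
            # The latest log always becomes the session's end.
            current["SessionEnd"] = ts
            current["Count"] += 1
    return sessions


# Hypothetical sample data.
logs = [
    ("10.0.0.1", datetime(2020, 1, 1, 10, 0)),
    ("10.0.0.2", datetime(2020, 1, 1, 10, 5)),
    ("10.0.0.1", datetime(2020, 1, 1, 10, 10)),
    ("10.0.0.1", datetime(2020, 1, 1, 10, 45)),
]
sessions = sessionize(logs)
```

Each resulting dictionary could then be written to a separate index as one document per session, if that is the direction you want to take.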