I want to calculate and ingest how much data is sent outside on a daily basis, so I am trying to aggregate the logs by "userid" and sum the "Bytes_out" column per day.
Requesting code/help on how to achieve this.
That's expected behavior. What you see are the original log events themselves; they still pass through to the output because you are not explicitly dropping them.
The aggregation filter does not alter the original messages; it just creates new ones. You should see the aggregated events 1 hour after Logstash starts.
More specifically, each new user_id's aggregated event should spawn 1 hour after the first time you receive that user_id.
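A minimal sketch of such a configuration (the field names `userid` and `Bytes_out` are taken from the question; the timeout value and tag are assumptions):

```
filter {
  aggregate {
    # one aggregation map per user
    task_id => "%{userid}"
    code => "
      map['userid'] ||= event.get('userid')
      map['bytes_out_total'] ||= 0
      map['bytes_out_total'] += event.get('Bytes_out').to_i
    "
    # emit the map as a new event when the timeout fires,
    # rather than modifying the original log events
    push_map_as_event_on_timeout => true
    timeout => 3600                 # seconds after the first event for that user
    timeout_tags => ['aggregated']  # lets you route/filter the new events downstream
  }
}
```

The original events flow through untouched; the `aggregated` tag can be used in the output section to send only the summary events to a separate index if desired.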
Hi Paz,
Thanks, it is working. One more query: I am using a .csv file as input. Can I use the "timestamp" present in the .csv file, rather than the system time, for the timeout?
This one is resolved.
For the timestamp, I am parsing the date one more time in the aggregation filter.
For the aggregation, I am using %{+d} along with the user id to aggregate a day's data; similarly, %{+ww} can be used to aggregate weekly data.
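The approach above can be sketched as follows (the CSV column name `timestamp` and its date format are assumptions; adjust them to match your file):

```
filter {
  # parse the timestamp column from the CSV so @timestamp reflects
  # the event time instead of the ingest time
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]  # format assumed
  }
  aggregate {
    # including the day-of-month (%{+d}) in the task_id starts a new
    # aggregation map for each calendar day per user;
    # use %{+ww} (week of year) instead for weekly buckets
    task_id => "%{userid}_%{+d}"
    code => "
      map['bytes_out_total'] ||= 0
      map['bytes_out_total'] += event.get('Bytes_out').to_i
    "
    push_map_as_event_on_timeout => true
    timeout => 86400  # one day, in seconds
  }
}
```

If your aggregate filter version supports it, `timeout_timestamp_field => "@timestamp"` makes the timeout follow the parsed event time rather than the system clock, which suits replayed .csv data.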