Creating Logstash metrics beyond counts, like sum or avg on field values of processed documents

Hi folks,

I have a use case where I process log files from my CDN provider and want to count requests as well as sum the bytes transferred (taken from each log line) for the different products I operate (a set of domains, or referrer traffic on those domains), in order to distribute the monthly CDN costs across those products.

As our products generate ~200 million log lines per day, I cannot simply keep a full month of log lines in my Elasticsearch cluster and run my queries on that; I only have the resources to store roughly one week.

So I see two options. The first is to set up a cron job that queries my Elasticsearch cluster every night for the previous day, computes the aggregates per product (request count, sum of bytes; roughly like the query sketched below) and writes them to a separate index. The downside: if the previous day's log lines were not processed properly, I run into problems.
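The nightly query would look something like this (untested; the index name `cdn-logs-*` and the `product`/`bytes` field names are placeholders for my actual mapping):

```
curl -s -XPOST 'localhost:9200/cdn-logs-*/_search' -H 'Content-Type: application/json' -d '
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "now-1d/d", "lt": "now/d" } }
  },
  "aggs": {
    "per_product": {
      "terms": { "field": "product" },
      "aggs": {
        "sum_bytes": { "sum": { "field": "bytes" } }
      }
    }
  }
}'
```

The `doc_count` of each `per_product` bucket gives the request count and `sum_bytes` the byte total, which the cron job would then write to the separate index.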

The second option is to use the Logstash metrics filter (https://www.elastic.co/guide/en/logstash/current/plugins-filters-metrics.html) and write the resulting metrics to Prometheus or a database, to later calculate the CDN cost per product from them.
However, and here comes the problem: the metrics filter only seems to be able to meter documents, i.e. emit counts and rates. It does not seem to be able to aggregate field values of the processed documents, such as the sum of bytes. Is there a way I can achieve that with an existing Logstash plugin, or do I need to write my own plugin for that?
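For reference, this is roughly how far I get with the metrics filter (the `product` field name is an assumption on my side):

```
filter {
  metrics {
    # one meter per product; emits periodic events with
    # [product_<name>][count] and 1m/5m/15m rate fields
    meter => [ "product_%{product}" ]
    flush_interval => 60
    add_tag => [ "metric" ]
  }
}
```

That gives me counts and rates per product, but nothing that sums up the `bytes` values.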

Any advice is appreciated!

Cheers,
Martin

You might be able to do it with an aggregate filter. Something like example 3 in the docs (the "no end event" pattern).
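A minimal, untested sketch along those lines (the `product` and `bytes` field names are assumptions):

```
filter {
  aggregate {
    # group events by product
    task_id => "%{product}"
    code => "
      map['product'] ||= event.get('product')
      map['requests'] ||= 0
      map['requests'] += 1
      map['bytes_sum'] ||= 0
      map['bytes_sum'] += event.get('bytes').to_i
    "
    # with no end event, flush one aggregated event per product
    # after the timeout expires
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "product"
    timeout => 3600
    timeout_tags => ["aggregated"]
  }
}
```

Note that the aggregate filter keeps its maps in local memory, so it only works correctly with a single pipeline worker (`pipeline.workers: 1`, or `-w 1` on the command line).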

Thanks Badger! I had already stumbled upon that one, but I guess I lacked the imagination to fit my use case into it. I will try that tomorrow. I hope the timeouts work as expected!

Cheers,
Martin
