Using painless to calculate durations

Mark_Harwood · February 12, 2018, 5:01pm

I'm not a Logstash expert, but it's worth investigating how robust that approach is.

If you're relying on an in-memory cache to join a1 with a2, z1 with z2 and so on, then you have to ask yourself these questions:

1. What happens when the Logstash process dies with a buffer full of a1, z1 etc. but no a2, z2 to complete them?
2. What happens if a1 and a2 are very far apart in time? How much RAM do you use to keep the [x]1s around waiting for the corresponding [x]2s? What happens if you purge that RAM?
3. What happens if one Logstash process requires too much in the way of resources to do this join? How do you route events to multiple Logstash workers based on ID? (I imagine this is possible, but it's something you may need to consider.)
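
To make the routing question concrete, here is a minimal Python sketch of one common way to do it: hash the ID and take the result modulo the number of workers, so that a start event and its matching stop event always land on the same worker. This is not Logstash configuration; the function name `worker_for`, the worker count and the event IDs are all made up for illustration.

```python
import zlib

def worker_for(event_id: str, num_workers: int) -> int:
    """Pick a worker index from a stable hash of the event ID (hypothetical helper)."""
    # crc32 is deterministic across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(event_id.encode("utf-8")) % num_workers

# Hypothetical stream of (id, kind) events spread over 4 workers.
events = [("a", "start"), ("z", "start"), ("a", "stop"), ("z", "stop")]
assignments = {}
for event_id, kind in events:
    assignments.setdefault(worker_for(event_id, 4), []).append((event_id, kind))
```

Note that changing the number of workers changes the mapping, so any start events already buffered on the old worker would never see their stop events.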

Re point 2 - a quick search suggests that, by default, a1 will hang around in RAM indefinitely waiting for a2, which has previously caused memory issues - see here. Again, I'm not a Logstash expert, but I assume you'll have to choose between risking overloading RAM and adopting a buffer-ageing policy that can potentially lose data.
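
The trade-off can be illustrated with a small Python sketch of an in-memory join that uses a buffer-ageing policy. It is only a toy model under my own assumptions, not how any Logstash filter is actually implemented; the class `StartEventBuffer`, its methods and the 60-second limit are hypothetical.

```python
class StartEventBuffer:
    """Toy in-memory join of start/stop events with an ageing policy (illustrative only)."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self.pending = {}   # event ID -> timestamp of the start event
        self.expired = []   # IDs purged before their stop event arrived

    def on_start(self, event_id, timestamp):
        self.pending[event_id] = timestamp

    def on_stop(self, event_id, timestamp):
        # Returns the duration, or None if the start event is no longer buffered.
        start = self.pending.pop(event_id, None)
        if start is None:
            return None
        return timestamp - start

    def purge(self, now):
        # Ageing policy: drop start events older than max_age_seconds to bound RAM use.
        for event_id, start in list(self.pending.items()):
            if now - start > self.max_age_seconds:
                del self.pending[event_id]
                self.expired.append(event_id)


buf = StartEventBuffer(max_age_seconds=60)
buf.on_start("a", 0)
buf.on_start("z", 10)
short = buf.on_stop("a", 30)   # a2 arrives in time, so the duration can be computed
buf.purge(now=100)             # z1 is now 90 seconds old and gets dropped
lost = buf.on_stop("z", 120)   # z2 arrives too late: no duration, the data is lost
```

Without the `purge` call the `pending` dict grows for as long as stop events fail to arrive; with it, late pairs like z1/z2 silently lose their duration. Either way, everything in `pending` is gone if the process dies.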

The more complex architecture I outline in the video doesn't rely on fallible RAM buffers but may be more work for you to implement.
