I've loaded data collected from DropWizard Metrics and am having trouble aggregating it correctly.
The following data sample demonstrates the problem. The first minute had 100 requests that averaged 1ms to complete, and the second minute had 1 request that averaged 2000ms.
{
  "@timestamp": "2018-03-06T00:00:00.000Z",
  "count": 100,
  "duration_unit": "milliseconds",
  "mean": 1.0,
  "metricName": "getUser"
},
{
  "@timestamp": "2018-03-06T00:01:00.000Z",
  "count": 1,
  "duration_unit": "milliseconds",
  "mean": 2000.0,
  "metricName": "getUser"
}
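When I aggregate this in a date histogram to an hourly figure, I get a mean of about 1 second ...
(1 + 2000) / 2 = 1000.5
... which is not correct, because it averages the two per-minute means without taking the request counts into account.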
The correct figure would be (100 * 1 + 1 * 2000) / 101, or roughly 20.8 ms. This reminds me of the re-reduce logic within map/reduce systems.
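To make the arithmetic concrete, here is a minimal Python sketch of the two calculations, using only the count and mean fields from the sample above. The function names naive_mean and weighted_mean are just illustrative, and this is plain client-side arithmetic rather than anything the data store does for me.

```python
# The two documents from the sample, reduced to the fields that matter.
samples = [
    {"count": 100, "mean": 1.0},
    {"count": 1, "mean": 2000.0},
]


def naive_mean(docs):
    """Average of the per-minute means, ignoring how many requests each covers."""
    return sum(d["mean"] for d in docs) / len(docs)


def weighted_mean(docs):
    """Per-minute means weighted by request count (total duration / total requests)."""
    total_count = sum(d["count"] for d in docs)
    total_duration = sum(d["count"] * d["mean"] for d in docs)
    return total_duration / total_count


naive = naive_mean(samples)        # what I currently get
weighted = weighted_mean(samples)  # what I actually want

print(f"naive mean:    {naive:.1f} ms")
print(f"weighted mean: {weighted:.1f} ms")
```

What I'm after is the second calculation, done per histogram bucket: the sum of count * mean divided by the sum of count.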
I've searched the documentation and Google for answers, but had no luck. Has anyone else experienced this issue and worked out a solution?
Thanks - Jim Ronan