Hi,
I'm trying to calculate the mean time between recovery (MTBR) and the mean time between failures (MTBF) of some services monitored by Heartbeat. For example, for MTBR, for each service I would like to get the time elapsed between two successive documents that have the same monitor.id and have monitor.status down and up respectively. How can I do that?
P.S. I can also do further offline processing once I have obtained the data.
There's not really a great way to do that in a single query. A prerequisite for timelines is including the frequency of the check with each message, which will let you calculate a somewhat accurate number for average time down over a period (just the sum of the frequency for all down checks). The timelines PR is more accurate (it handles mis-scheduled items) but requires a lot of complex processing in JS.
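Since offline processing is an option, here is a minimal Python sketch of both ideas, not the timelines implementation. It assumes the documents have already been exported and flattened into dicts with hypothetical keys `monitor_id`, `status`, `timestamp` (a `datetime`) and `interval_s` (the check frequency in seconds, which is not part of each message today and would have to be supplied by you).

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def recovery_times(docs):
    """Map each monitor id to the durations (seconds) of its down periods.

    A down period starts at the first "down" check after an "up" and ends
    at the next "up" check. A period still open at the end is ignored.
    """
    by_monitor = defaultdict(list)
    for doc in docs:
        by_monitor[doc["monitor_id"]].append(doc)

    result = {}
    for monitor_id, events in by_monitor.items():
        events.sort(key=lambda d: d["timestamp"])
        down_since = None
        durations = []
        for event in events:
            if event["status"] == "down" and down_since is None:
                down_since = event["timestamp"]
            elif event["status"] == "up" and down_since is not None:
                delta = event["timestamp"] - down_since
                durations.append(delta.total_seconds())
                down_since = None
        result[monitor_id] = durations
    return result


def estimated_downtime(docs):
    """Approximate total downtime per monitor as the sum of the check
    interval (hypothetical "interval_s" key) over all "down" checks."""
    totals = defaultdict(float)
    for doc in docs:
        if doc["status"] == "down":
            totals[doc["monitor_id"]] += doc["interval_s"]
    return dict(totals)


def mean_recovery_time(docs):
    """Average down-period duration per monitor, skipping monitors
    that never recovered from (or never had) a down period."""
    return {m: mean(d) for m, d in recovery_times(docs).items() if d}
```

The same pairing logic with the statuses swapped (from the first up after a down to the next down) would give the time between failures. Note that the first function measures from the first failed check, so the real outage may have started up to one check interval earlier.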