How to measure Availability KPIs and downtime from heartbeats?

I've configured my Heartbeat config file to send a heartbeat every 30 seconds. Suppose I receive a beat with monitor_status "down", and the next beat is "up".

  • Is there a field that represents the downtime between the beats?
  • What information can I get from the field "beat_monitor_duration_us"?

This is a great idea.

  1. There's no such field today. I think it'd be great to have each document include total downtime or uptime. It's kinda tricky in a distributed situation, however. It would probably depend on querying back to Elasticsearch, at least on startup, to handle a Heartbeat restart or a move to another machine. I've opened an issue here to track it. I think we should build it at some point.
  2. The monitor duration tracks how long it took to run the monitor check (in microseconds, as the "_us" suffix suggests). That means sending out a request, then receiving, downloading, and processing the result. Since processing time is negligible, this maps to the overall response time of your site in most cases.
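
Until something like that exists, downtime can be approximated on the client side from the beats themselves. Below is a minimal Python sketch, not anything Heartbeat ships: it assumes you have already pulled the beats for one monitor out of Elasticsearch as (timestamp, status) pairs sorted by time. The function names and the tuple layout are made up for illustration. An outage is counted from the first "down" beat to the next "up" beat, so with a 30-second schedule the estimate can be off by up to one interval at each end.

```python
from datetime import datetime, timedelta


def estimate_downtime(beats):
    """Sum the time between the first "down" beat of each outage and the next "up" beat.

    beats: list of (timestamp, status) pairs sorted by timestamp,
           where status is "up" or "down" (hypothetical layout).
    """
    total = timedelta(0)
    down_since = None  # timestamp of the first "down" beat in the current outage
    last_ts = None
    for ts, status in beats:
        if status == "down" and down_since is None:
            down_since = ts
        elif status == "up" and down_since is not None:
            total += ts - down_since
            down_since = None
        last_ts = ts
    # An outage that is still open is counted up to the last beat seen.
    if down_since is not None:
        total += last_ts - down_since
    return total


def availability(beats):
    """Fraction of the observed window spent up, or None if the window is empty."""
    if len(beats) < 2:
        return None
    window = beats[-1][0] - beats[0][0]
    if window <= timedelta(0):
        return None
    return 1 - estimate_downtime(beats) / window


# Five beats, 30 seconds apart: up, down, down, up, up.
start = datetime(2024, 1, 1, 12, 0, 0)
statuses = ["up", "down", "down", "up", "up"]
sample = [(start + timedelta(seconds=30 * i), s) for i, s in enumerate(statuses)]

print(estimate_downtime(sample))  # 0:01:00
print(availability(sample))       # 0.5
```

In the sample, the monitor is first seen down at 12:00:30 and first seen up again at 12:01:30, so the estimated downtime is 60 seconds out of a 120-second window.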
