We have a situation where
dockerd periodically crashes on a host, which keeps jobs from running on time. Our application team would like to be notified when
dockerd goes down. Unfortunately, we do not have the Docker API configured, so I'm unable to simply put a monitor on
I have observed that when the Docker integration is unable to connect to
/var/run/docker.sock, it generates a specific error message in the
metrics data sets.
How can I generate an alert based on this? Frankly, simply alerting on the presence of
error.message would suffice for this use case.
Being able to monitor a Unix socket and alert whenever a connection to it fails would work as well (e.g.,
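
To illustrate the kind of socket check I have in mind, here is a minimal Python sketch. It is only an assumption about how such a probe could look, not something the Docker integration provides: the function name `socket_is_up` and the 2-second timeout are made up for illustration, and the only thing taken from my setup is the socket path.

```python
import socket


def socket_is_up(path="/var/run/docker.sock", timeout=2.0):
    """Return True if a stream connection to the Unix socket at `path` succeeds.

    Hypothetical helper for illustration; the default path is the Docker
    socket mentioned above, the timeout value is an arbitrary assumption.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers a missing socket file, a refused connection
        # (file exists but nothing is listening), and timeouts.
        return False
    finally:
        sock.close()
```

A scheduled job could call `socket_is_up()` and raise an alert whenever it returns `False`. Note that this only confirms something is accepting connections on the socket; it does not confirm the daemon is answering API requests.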