We have a situation where dockerd periodically crashes on a host, which keeps jobs from running on time. Our application team would like to be notified when dockerd goes down. Unfortunately, we do not have the Docker API configured, so I'm unable to simply put a monitor on localhost:2375/_ping.
I have observed that when the Docker integration is unable to connect to /var/run/docker.sock, it generates a specific error message in the docker.* metrics data sets.
How can I generate an alert based on this? Frankly, simply alerting on the presence of error.message would suffice for this use case.
Edit
Being able to monitor a unix socket and alert whenever a connection to it fails would work as well (e.g., unix:///var/run/docker.sock).
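To make the socket idea concrete, here is a minimal sketch of the kind of check I have in mind, written in Python. The function name unix_socket_is_up and the 2-second timeout are my own placeholders, not part of any existing integration; the only thing taken from our setup is the socket path.

```python
import socket


def unix_socket_is_up(path: str = "/var/run/docker.sock", timeout: float = 2.0) -> bool:
    """Return True if a stream connection to the unix socket at `path` succeeds.

    Hypothetical helper for illustration; the name and the timeout default
    are assumptions, not an existing API.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers a missing socket file, a refused connection,
        # permission errors, and timeouts.
        return False
    finally:
        sock.close()
```

A False result is what I would want to alert on. Note that a True result only shows that something is accepting connections on the socket, not that dockerd itself is healthy.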