Generate alert for error message in metrics dataset

We have a situation where dockerd periodically crashes on a host, which keeps jobs from running on time. Our application team would like to be notified when dockerd goes down. Unfortunately, we do not have the Docker API configured, so I can't simply put a monitor on localhost:2375/_ping.

I have observed that when the Docker integration is unable to connect to /var/run/docker.sock, it writes a specific error message to the docker.* metrics datasets.

How can I generate an alert based on this? Frankly, simply alerting on the presence of error.message would suffice for this use case.
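To make the idea concrete, here is a minimal sketch of the kind of search body a threshold-style alert could run: it counts recent docker documents that carry an `error.message` field. The function name, the 5-minute window, and the `event.module` filter are assumptions for illustration; older Beats versions may use a different module field, so check a sample document before relying on it.

```python
import json


def build_docker_error_query(window="5m", module="docker"):
    """Build a search body matching recent docker metric documents
    that contain an error.message field.

    `window` and `module` are illustrative defaults, not values
    taken from any particular deployment.
    """
    return {
        "size": 0,  # only the hit count matters for alerting
        "track_total_hits": True,
        "query": {
            "bool": {
                "filter": [
                    # Assumed field name; verify against your documents.
                    {"term": {"event.module": module}},
                    # Match any document where error.message is present.
                    {"exists": {"field": "error.message"}},
                    # Restrict to the most recent time window.
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_docker_error_query(), indent=2))
```

An alert would then fire whenever the hit count for this query is greater than zero.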

Edit

The ability to monitor a Unix socket (e.g., unix:///var/run/docker.sock) and alert when the connection fails would work as well.
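For reference, the check I have in mind is roughly the following: try to connect to the socket and treat any failure as "down". This is only a sketch using the Python standard library; the function name and the 2-second timeout are my own choices, and it would have to be run by cron or some other scheduler on the host itself.

```python
import socket
import sys


def unix_socket_reachable(path="/var/run/docker.sock", timeout=2.0):
    """Return True if a stream connection to the Unix socket succeeds."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(path)
        return True
    except OSError:
        # Covers a missing socket file, connection refused, permission
        # errors, and timeouts (socket.timeout is an OSError subclass).
        return False


if __name__ == "__main__":
    # Exit code 0 when the socket accepts connections, 1 otherwise.
    sys.exit(0 if unix_socket_reachable() else 1)
```

Note that this only confirms that something is listening on the socket; it does not confirm that dockerd is actually answering API requests.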
