Some advice?

Hi, we have 300+ servers enrolled with a fleet policy. We would like to roll out monitors to these servers to check the uptime of local services. We have this running for one service now, and it is a bit problematic: since it tests localhost:someport, that is also what it reports, which makes it hard to find out on which host the service actually went down. Any advice on how to do this better?

Generally people would approach this by having a separate policy for a single box that does all the monitoring and has one monitor configured for each server. That way it'd be clear which one broke.

If servers monitor themselves, monitoring breaks when the box dies.

Take a look at project monitors, which can be configured programmatically via the CLI to help automate this.
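
As a rough sketch of what "one monitor per server" could look like when generated by a script: the snippet below builds one TCP monitor definition per hostname and puts the hostname into the monitor's name and id, so an alert points at a specific box. The server names, the port, and the `build_monitors` helper are made up for illustration, and the field names (`type`, `id`, `name`, `hosts`, `schedule`) are assumed to match the lightweight monitor format, so check them against the docs for your version. The output is JSON, which is also valid YAML.

```python
import json


def build_monitors(servers, port, schedule="@every 1m"):
    """Return one TCP monitor definition per server, named after that server."""
    monitors = []
    for server in servers:
        monitors.append({
            "type": "tcp",
            # Unique id per server so the monitors do not overwrite each other.
            "id": f"svc-{port}-{server}",
            # The hostname in the name is what makes a failure easy to locate.
            "name": f"service:{port} on {server}",
            "hosts": [f"{server}:{port}"],
            "schedule": schedule,
        })
    return monitors


if __name__ == "__main__":
    # Hypothetical inventory; in practice this would come from your CMDB or fleet list.
    servers = ["app-001.example.internal", "app-002.example.internal"]
    print(json.dumps({"heartbeat.monitors": build_monitors(servers, 8080)}, indent=2))
```

The generated file could then be fed into whatever push step your CI/CD pipeline uses.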

The problem is that none of these hosts are reachable from our network, so monitoring them all from a single box is not an option. Hence the locally deployed APM server in the fleet policy. But I might indeed look into the CI/CD project setup.
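
For the local-only case, the underlying idea is to label each result with the machine's own hostname instead of "localhost". The sketch below is not how the fleet integration reports results; it is a standalone illustration using only the Python standard library, and `check_local_service` is a hypothetical helper name. It connects to a port on 127.0.0.1 and returns a record keyed by the real hostname.

```python
import socket


def check_local_service(port, timeout=2.0):
    """Probe a local TCP port and label the result with this machine's hostname."""
    host = socket.gethostname()
    try:
        # The probe itself still goes to loopback; only the label changes.
        with socket.create_connection(("127.0.0.1", port), timeout=timeout):
            up = True
    except OSError:
        # Covers refused connections and timeouts.
        up = False
    return {"host": host, "target": f"{host}:{port}", "up": up}


if __name__ == "__main__":
    # Port 8080 is an assumption; use the port of the service you care about.
    print(check_local_service(8080))
```

The same principle applies to a monitor definition: keep the target as localhost:someport but make the monitor name or a tag carry the hostname, so each of the 300+ instances is distinguishable.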