Watching watcher

If we rely on watches to alert us to production issues, then the watches themselves are a critical piece of the infrastructure. If watches stop running or start timing out, that is a live-site incident that needs to be addressed immediately. This means we need to be alerted about watches that time out, fail, or fail to run. Ideally, this notification would be resilient to Elasticsearch cluster issues (red status, heavy garbage collection, etc.).
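
One way to make these three conditions concrete is a small check that runs outside the cluster and inspects the most recent execution record of each watch. The sketch below is illustrative only: the record shape (`watch_id`, `state`, `triggered_time`, `duration_ms`), the function names, the `fetch_records` callable, and the thresholds are all assumptions made for this example, not the actual watch history schema or an existing API. It treats an unreachable history source as an alert in its own right, so a red or overloaded cluster cannot silently suppress the notification.

```python
from datetime import datetime, timedelta


def find_unhealthy_watches(records, expected_watch_ids, now: datetime,
                           max_silence=timedelta(minutes=10),
                           max_duration_ms=30_000):
    """Return {watch_id: reason} for watches that failed, ran too long, or went silent.

    `records` is a list of dicts with hypothetical keys:
    watch_id, state, triggered_time (datetime), duration_ms.
    """
    # Keep only the most recent record per watch.
    latest = {}
    for rec in records:
        prev = latest.get(rec["watch_id"])
        if prev is None or rec["triggered_time"] > prev["triggered_time"]:
            latest[rec["watch_id"]] = rec

    problems = {}
    for watch_id in expected_watch_ids:
        rec = latest.get(watch_id)
        if rec is None or now - rec["triggered_time"] > max_silence:
            problems[watch_id] = "not running"   # no recent execution at all
        elif rec["state"] == "failed":           # assumed state value
            problems[watch_id] = "failed"
        elif rec["duration_ms"] > max_duration_ms:
            problems[watch_id] = "timed out"     # ran, but exceeded the budget
    return problems


def check(fetch_records, expected_watch_ids, now: datetime):
    """Run the health check; an unreachable history source counts as a problem."""
    try:
        records = fetch_records()  # hypothetical callable that queries watch history
    except Exception as exc:
        # If the cluster cannot answer, we cannot trust any watch: flag them all.
        return {w: f"history unavailable: {exc}" for w in expected_watch_ids}
    return find_unhealthy_watches(records, expected_watch_ids, now)
```

Passing the list of expected watch IDs explicitly is what lets the check notice a watch that has never run or has stopped running, since such a watch leaves no record to inspect. Running the check from a scheduler on a separate host keeps it independent of the cluster it is monitoring.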