Watcher repository by metricbeat module (kubernetes, system)

Hi,

What do people here think of creating a repository to store default (or example) Watcher alerts for the kubernetes and/or system module? (I'm also in favor of other modules but we should pick somewhere to start).

My hunch is that most teams would want a similar set of basic alerts, and that most teams don't have the best coverage so far. Combining efforts would be very helpful.

Best,
Justin

Hello,

There are some sample watches in this repository:

If none of the samples is what you're looking for, perhaps we can include some additional sample watches.

Thanks.

Hi @Michael_Madden, thanks for the reply.

I've seen this repo. I guess to be more specific, I'm suggesting we create a full suite of alerts that teams can apply to get a base level of monitoring for each host running metricbeat.

For the system module, that would mean alerts for CPU, memory, and filesystem usage. For the kubernetes module, it would cover things like pending pods, pods in a crash loop, and pods using more resources than requested.
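To make that concrete, here is a rough sketch of what one entry in such a suite might look like: a CPU watch, built in Python and dumped as JSON. The index pattern `metricbeat-*`, the field `system.cpu.total.norm.pct`, the 0.9 threshold, and the function name `cpu_watch` are all assumptions on my part and would need adjusting to your Metricbeat version and setup.

```python
import json


def cpu_watch(threshold=0.9, window="5m", interval="1m"):
    """Build a watch body that fires when max CPU over `window` exceeds `threshold`.

    Assumptions (not from any official sample): the index pattern,
    the CPU field name, and the default threshold.
    """
    return {
        # Run the watch on a fixed schedule.
        "trigger": {"schedule": {"interval": interval}},
        # Search recent cpu metricset documents and compute the max CPU value.
        "input": {
            "search": {
                "request": {
                    "indices": ["metricbeat-*"],  # assumed index pattern
                    "body": {
                        "size": 0,
                        "query": {
                            "bool": {
                                "filter": [
                                    {"term": {"metricset.name": "cpu"}},
                                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                                ]
                            }
                        },
                        "aggs": {
                            # assumed field name; check your Metricbeat mapping
                            "max_cpu": {"max": {"field": "system.cpu.total.norm.pct"}}
                        },
                    },
                }
            }
        },
        # Fire only when the aggregated value is above the threshold.
        "condition": {
            "compare": {"ctx.payload.aggregations.max_cpu.value": {"gt": threshold}}
        },
        # A logging action keeps the sketch free of external dependencies.
        "actions": {
            "log_high_cpu": {
                "logging": {
                    "text": "High CPU: {{ctx.payload.aggregations.max_cpu.value}}"
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(cpu_watch(), indent=2))
```

A real suite would probably aggregate per host and use an email or webhook action instead of logging, but the overall shape would be similar.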

Ideally one would just run make apply, which would HTTP PUT each watch.json to the cluster.
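The apply step could be sketched roughly like this, using only the Python standard library. It assumes a directory of `*.json` watch files where the file name (minus extension) is used as the watch ID, and the `_watcher/watch/<id>` endpoint of recent Elasticsearch versions (older releases used an `_xpack/` prefix). The function names, the directory layout, and the base URL are hypothetical, and authentication is left out.

```python
import json
import pathlib
import urllib.request


def build_put_requests(watch_dir, base_url):
    """Return one PUT request per *.json file in watch_dir.

    The watch ID is taken from the file name (hypothetical convention).
    """
    requests = []
    for path in sorted(pathlib.Path(watch_dir).glob("*.json")):
        # Parsing first means a malformed file fails before anything is sent.
        body = json.loads(path.read_text())
        url = f"{base_url.rstrip('/')}/_watcher/watch/{path.stem}"
        requests.append(
            urllib.request.Request(
                url,
                data=json.dumps(body).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="PUT",
            )
        )
    return requests


def apply_watches(watch_dir, base_url):
    """Send every watch to the cluster (no auth or retries in this sketch)."""
    for req in build_put_requests(watch_dir, base_url):
        with urllib.request.urlopen(req) as resp:
            print(req.full_url, resp.status)


if __name__ == "__main__":
    # Assumed layout: ./watches/*.json and a local, unsecured cluster.
    apply_watches("watches", "http://localhost:9200")
```

A make apply target would then only need to call this script; splitting request building from sending keeps the first part easy to test without a cluster.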

kube-prometheus does something very similar for the Prometheus/Grafana stack.

Just to follow up here... does that sound reasonable?
