Here is some information that is in my logs:
I would like to create an alert any time a log entry includes the word ERROR and kubernetes.pod.name includes the string redis-leader. I have left out the match against ERROR, as that part works fine; I am only having a problem matching on redis-leader within kubernetes.pod.name. I am using the log threshold alert type. Here are some things I tried:
This does not work:
WHEN more than or equals 1 log entry
WITH kubernetes.pod.name IS redis-leader-*
FOR THE LAST 5 minutes
This works, but I have to add an alert for every pod, and if the app scales, the new pods are unmonitored because each new pod gets a fresh name:
WHEN more than or equals 1 log entry
WITH kubernetes.pod.name IS redis-leader-74d59c4b7f-zxz81
FOR THE LAST 5 minutes
Should I be able to use wildcards? I tried a bunch of escaping and quoting, but never got there.
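To make the matching behaviour I am after concrete, here is a small Python sketch. It only illustrates the difference between an exact comparison and a wildcard comparison on the pod name; it says nothing about how the log threshold alert evaluates IS internally. The second and third pod names are made up for the example.

```python
from fnmatch import fnmatchcase

# Only the first name comes from my logs; the other two are hypothetical.
pod_names = [
    "redis-leader-74d59c4b7f-zxz81",
    "redis-leader-74d59c4b7f-abcde",
    "redis-follower-5c9d8f7b6d-qwert",
]

pattern = "redis-leader-*"


def exact_match(name: str, pattern: str) -> bool:
    """Treat the pattern as a literal value, '*' included."""
    return name == pattern


def wildcard_match(name: str, pattern: str) -> bool:
    """Treat '*' as 'any run of characters' (case-sensitive glob)."""
    return fnmatchcase(name, pattern)


# Literal comparison: no pod is literally called "redis-leader-*".
exact_hits = [n for n in pod_names if exact_match(n, pattern)]

# Wildcard comparison: every redis-leader pod, whatever its suffix.
wildcard_hits = [n for n in pod_names if wildcard_match(n, pattern)]

print("exact:", exact_hits)
print("wildcard:", wildcard_hits)
```

The wildcard result is what I want from the alert: every redis-leader pod should match, including ones created after the alert was set up.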
The next thing I would like to do is add something like a group by: a single alert telling me that there were five error messages for redis-leaders is less valuable than three separate alerts, each telling me that a particular redis-leader had messages containing errors. I cannot find a group by in the UI for log thresholds. It would be great if I could use the technique from Create alert per multiple fields.
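As a rough illustration of the per-pod grouping I have in mind, here is a Python sketch. The log entries, the `message` field name, and the `alerts_per_pod` helper are all hypothetical; this is just the logic I would like the alert to apply, not something Kibana exposes.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical log entries from the last 5 minutes; "message" is an assumed field name.
log_entries = [
    {"kubernetes.pod.name": "redis-leader-74d59c4b7f-zxz81", "message": "ERROR connection lost"},
    {"kubernetes.pod.name": "redis-leader-74d59c4b7f-zxz81", "message": "ERROR connection lost"},
    {"kubernetes.pod.name": "redis-leader-74d59c4b7f-abcde", "message": "ERROR timeout"},
    {"kubernetes.pod.name": "redis-follower-5c9d8f7b6d-qwert", "message": "ERROR timeout"},
    {"kubernetes.pod.name": "redis-leader-74d59c4b7f-zxz81", "message": "INFO ready"},
]


def alerts_per_pod(entries, pod_pattern, threshold=1):
    """Count ERROR entries per matching pod and keep pods at or above the threshold."""
    counts = Counter(
        entry["kubernetes.pod.name"]
        for entry in entries
        if "ERROR" in entry["message"]
        and fnmatchcase(entry["kubernetes.pod.name"], pod_pattern)
    )
    return {pod: n for pod, n in counts.items() if n >= threshold}


# One alert per pod instead of a single combined alert.
for pod, error_count in alerts_per_pod(log_entries, "redis-leader-*").items():
    print(f"ALERT: {pod} logged {error_count} ERROR entries")
```

With grouping like this, each redis-leader pod that crosses the threshold would raise its own alert, and newly scaled pods would be picked up without editing the alert.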