I am trying to create a Kibana alert for syslog messages that are captured by Logstash and minimally parsed. The format is this:
<syslog_priority>process_name[pid]: message
The "message" portion is not parsed further, but it contains information which identifies the IP availability status in a network node. When the Kibana alert is triggered, I need this message to be displayed to identify where the issue is happening.
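To make the format concrete, here is a minimal Python sketch of how one such line breaks down into fields. It is only an illustration, not my actual Logstash config: the regex, the `parse_syslog` helper, and the sample line are all made up for this post, and the field name `program` is simply the one used in the alert condition below.

```python
import re

# Pattern for "<priority>process_name[pid]: message"; the message is kept whole.
SYSLOG_RE = re.compile(
    r"^<(?P<priority>\d+)>(?P<program>[^\[\s:]+)\[(?P<pid>\d+)\]:\s*(?P<message>.*)$"
)


def parse_syslog(line):
    """Return a dict of fields, or None if the line does not match the format."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["priority"] = int(fields["priority"])
    fields["pid"] = int(fields["pid"])
    return fields


# Hypothetical sample line; the real message content is not shown in this post.
sample = "<134>ip_availability[4321]: node 10.0.0.5 unreachable"
event = parse_syslog(sample)
print(event["program"], "->", event["message"])
```

The point is that everything after the colon stays in a single `message` field, which is the text I need to see in the alert.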
I created a Log Threshold alert which reads:
WHEN THE count OF LOG ENTRIES
WITH program IS ip_availability
IS more than or equals 1
FOR THE LAST 5 minutes
The above is set to check every 1 minute, and to notify only on status change. It works as expected: the resulting message indicates that 'n' log entries match the condition. I tried printing {{context}}, but it doesn't seem to contain any data from the events that triggered the alert. My first question is, is there some way to access the log entries' data?
I have also attempted a workaround, which more or less works. I have added a GROUP BY message.keyword to the alert config above. This does cause each event with a different message to create its own alert group. Then, I set the message template as such: {{alertName}} - {{context.group}}. As a result, {{context.group}} holds the syslog message which triggered the alert, and each alert displays the original event which caused the alert.
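As I understand it, the workaround behaves roughly like the Python sketch below. This is only my mental model, not how Kibana actually evaluates the rule: the `render_alerts` helper, the alert name, and the sample events are hypothetical, and the format string just stands in for the {{alertName}} - {{context.group}} template.

```python
from collections import Counter


def render_alerts(alert_name, events, threshold=1):
    """Group matching events by full message text and render one line per group."""
    # One group per distinct message, mirroring GROUP BY message.keyword.
    counts = Counter(
        e["message"] for e in events if e["program"] == "ip_availability"
    )
    # Stand-in for the template "{{alertName}} - {{context.group}}".
    template = "{alert_name} - {group}"
    return [
        template.format(alert_name=alert_name, group=group)
        for group, count in counts.items()
        if count >= threshold  # "IS more than or equals 1"
    ]


# Hypothetical events from the last 5 minutes.
events = [
    {"program": "ip_availability", "message": "node 10.0.0.5 unreachable"},
    {"program": "ip_availability", "message": "node 10.0.0.5 unreachable"},
    {"program": "ip_availability", "message": "node 10.0.0.9 unreachable"},
    {"program": "sshd", "message": "session opened"},
]

for line in render_alerts("IP availability", events):
    print(line)
```

Every distinct message text becomes its own group, so the number of groups grows with the number of different messages seen in the window.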
However, there are a lot of glitches happening with this workaround. As soon as I enable GROUP BY message.keyword, the preview for the condition WITH program IS ip_availability in the Edit Alert screen stops working. I intermittently get timeout errors in the alert screen. The alerts themselves come through, but sometimes I get the alerts with a delay of 5 minutes, or even 30 minutes. Sometimes I get duplicate alerts, and sometimes I get no alert at all. I am guessing using "keyword" on a syslog message is a bad idea.
I apologize in advance if I am missing something silly, but can I please get some advice on the proper way to set this type of alert? Or, otherwise, how to fix/alleviate the problems with my GROUP BY workaround, or what I am doing wrong with it? Thank you in advance.