I have several services sending logs to Elasticsearch and displaying them in Kibana. Is there a way to find out which service is not sending logs to Elasticsearch? Using the existing logs, how can I build a filter that shows which services are not sending data? Or is there a Kibana dashboard I can use to show services that are down? I don't want to use the Heartbeat feature.
I don't think there is a simple solution for this provided in Kibana. We have integrated our monitoring and alerting system (Zabbix *) with Elastic for things like this, but I think the logic could be used with other tools.
I have a query (this example is for Winlogbeat) written in Python that returns a JSON list of hosts that have sent data in the past 4 hours. It runs via cron on a server and sends this list to Zabbix. Zabbix has a low-level discovery feature that takes this list and creates any new items that are needed, so an item is created when a new host is found. If a host is no longer found, Zabbix deletes the discovered item after 7 days (depending on the option chosen when discovery is set up).
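A minimal sketch of that discovery step, in Python. The field names (`host.name`, `@timestamp`), the `{#HOSTNAME}` macro, and the function names are assumptions for illustration, not the actual script described above. Running the search (for example against a `winlogbeat-*` index with whatever client you use) is left out; only the query body and the conversion to Zabbix low-level discovery JSON are shown.

```python
import json


def build_active_hosts_query(hours=4, host_field="host.name", max_hosts=1000):
    """Search body: one terms bucket per host seen in the last `hours` hours."""
    return {
        "size": 0,  # only the aggregation is needed, not the documents
        "query": {"range": {"@timestamp": {"gte": f"now-{hours}h"}}},
        "aggs": {"hosts": {"terms": {"field": host_field, "size": max_hosts}}},
    }


def to_zabbix_lld(response, macro="{#HOSTNAME}"):
    """Convert the aggregation response into Zabbix low-level discovery JSON."""
    buckets = response["aggregations"]["hosts"]["buckets"]
    hosts = sorted(bucket["key"] for bucket in buckets)
    return json.dumps({"data": [{macro: host} for host in hosts]})


# Hand-written sample, shaped like an Elasticsearch terms aggregation response.
sample_response = {
    "aggregations": {
        "hosts": {"buckets": [{"key": "web02", "doc_count": 40},
                              {"key": "web01", "doc_count": 15}]}
    }
}
lld_json = to_zabbix_lld(sample_response)
print(lld_json)
```

The macro name has to match whatever the item prototypes in your Zabbix discovery rule reference.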
A similar query runs every 15 minutes, counts the number of logged events for each host, and sends those values to Zabbix. It is also scheduled by cron and uses zabbix_sender.
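The counting step could look roughly like this. It is a sketch under assumptions: the Zabbix host name `elastic-logs`, the item key `log.events[...]`, and the function name are hypothetical, and the response is assumed to come from a terms aggregation on the host field over the last 15 minutes. The output uses the whitespace-separated `<hostname> <key> <value>` input format that `zabbix_sender -i` reads.

```python
def zabbix_sender_lines(known_hosts, response,
                        zabbix_host="elastic-logs", key="log.events"):
    """Return one '<hostname> <key> <value>' line per known host."""
    buckets = response["aggregations"]["hosts"]["buckets"]
    counts = {bucket["key"]: bucket["doc_count"] for bucket in buckets}
    # A terms aggregation only returns hosts that logged something, so hosts
    # from the discovered list that are missing here get an explicit 0.
    return [f"{zabbix_host} {key}[{host}] {counts.get(host, 0)}"
            for host in sorted(known_hosts)]


# Hand-written sample: only web01 logged anything in the window.
sample_response = {
    "aggregations": {"hosts": {"buckets": [{"key": "web01", "doc_count": 120}]}}
}
lines = zabbix_sender_lines(["web01", "web02"], sample_response)
print("\n".join(lines))
```

Writing an explicit 0 for silent hosts matters here: if nothing is sent, a trigger that compares the value to 0 has nothing to evaluate.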
Now that Zabbix has the data, alerting is easy: trigger if 0 (or some other low value) events are found for a period of time, or if an unusually high number of events occurs, as needed.
I give the hosts 1 hour before alerting, since some of these hosts take a while to reboot.
I think the general logic is that you will need a way to "discover" what should be sending logs, a place to store that list, and then check current event rates and alert.
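Independent of the tooling, that logic can be sketched in a few lines of Python. The names below are hypothetical; `expected` stands for the stored list of sources that should be sending logs, and `last_seen` for the most recent event timestamp per source (for example from a max aggregation on the timestamp field). The 1-hour grace period mirrors the one mentioned above.

```python
from datetime import datetime, timedelta, timezone


def find_silent_sources(expected, last_seen, now, grace=timedelta(hours=1)):
    """Return expected sources with no log event within `grace` of `now`."""
    silent = []
    for name in sorted(expected):
        seen = last_seen.get(name)  # None means the source never reported
        if seen is None or now - seen > grace:
            silent.append(name)
    return silent


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expected = {"billing", "auth", "search"}
last_seen = {
    "billing": now - timedelta(minutes=5),  # still reporting
    "auth": now - timedelta(hours=3),       # stopped three hours ago
}
silent = find_silent_sources(expected, last_seen, now)
print(silent)  # ['auth', 'search']
```

Because the check starts from the expected list rather than from the logs, a source that has never reported (`search` here) is flagged as well.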
* I'm just a user of both products. Zabbix was here first and is integrated with our on-call and notification process. Something similar could probably be done with Elastic Alerter, but that wasn't a wheel that needed to be reinvented for us.
Actually, there are a couple of ways to do this, although in my opinion there are some subtleties depending on what you are actually trying to accomplish (service down vs. just not sending logs, etc.).
First, and probably easiest, is to create a log threshold alert, group by your service name or host, and set a low threshold.
This will trigger when the log count for a host or service drops from above your threshold to below it.
However, if you start a service and it never sends any logs, this will not catch it, because the alert needs to see a count above the threshold first. In other words, the alert does not know your complete list of services or hosts that should be reporting logs, only those that are reporting and then drop or rise.
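A toy Python model of that limitation, based only on the behavior described above and not on Kibana's actual rule implementation. The function name and the sample counts are made up for illustration.

```python
def crossing_alerts(counts_by_service, threshold):
    """Alert for services whose count falls from >= threshold to below it."""
    alerts = []
    for service, counts in sorted(counts_by_service.items()):
        # Compare each interval with the one before it.
        for previous, current in zip(counts, counts[1:]):
            if previous >= threshold and current < threshold:
                alerts.append(service)
                break
    return alerts


# Log counts per check interval, oldest first.
history = {
    "auth": [50, 48, 0],   # was reporting, then stopped: caught
    "search": [0, 0, 0],   # never reported at all: missed
}
fired = crossing_alerts(history, threshold=1)
print(fired)  # ['auth']
```

The `search` service is never above the threshold, so no crossing is ever observed for it. Catching that case needs a separate list of services that are expected to report.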
This is generally a very good start.