We have a diverse multi-purpose cluster. It has over 60 different interfaces that receive data (SNMP, syslog, Filebeat, Logstash, pipelines, Perl/PowerShell/Python scripts, ...). Each of these is important, and we need to know if a particular interface is broken. As you know, there are hundreds of moving parts that can break an interface: disk space, security, patching, passwords, changes, database issues, deletions, upgrades, etc.
Has anyone created a solution that notifies them when an index has stopped receiving new data?
A simple starting point would be knowing whether an index has received no data in the last hour.
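
To make the "no data in the last hour" check concrete, here is a minimal sketch of the request body I would send to the `_count` API for one index, plus a check on the response. It assumes the index has a date field named `@timestamp` (an assumption, ours vary), and the helper names are made up for illustration.

```python
def last_hour_count_body(time_field="@timestamp", minutes=60):
    """Build a _count request body matching documents newer than `minutes` ago.

    `time_field` is an assumption; use whatever date field the index really has.
    """
    return {"query": {"range": {time_field: {"gte": f"now-{minutes}m"}}}}


def is_stale(count_response):
    """Return True when a _count response reports zero matching documents."""
    return count_response.get("count", 0) == 0
```

The body would be sent as `POST /<index>/_count`, and the index is treated as stale when the returned `count` is 0.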
For starters, I have built a Kibana dashboard for each index that shows bar charts or the time of the last data received. This has worked great, but as each year goes by and we create more indexes for more customers, it is becoming unusable to check each morning (60 charts and growing).
Requirements: Flexible. Needs to handle an unlimited number of indices. It would be nice not to have to touch the solution for every new data interface that is built.
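
One idea that might meet the "don't touch it per interface" requirement is a single search across all indices that groups by the `_index` field and takes the newest timestamp in each group. This is only a sketch: it assumes every index shares one date field (`@timestamp` here), the function names and the bucket limit are hypothetical, and an index that is empty or has been deleted would not show up in the results at all.

```python
def latest_per_index_body(time_field="@timestamp", max_indices=500):
    """Search body: one terms bucket per index, with the newest timestamp in each.

    `time_field` and `max_indices` are assumptions to adjust for the cluster.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "aggs": {
            "by_index": {
                "terms": {"field": "_index", "size": max_indices},
                "aggs": {"latest": {"max": {"field": time_field}}},
            }
        },
    }


def stale_indices(response, now_ms, max_age_minutes=60):
    """Return sorted index names whose newest document is older than the cutoff.

    A max aggregation on a date field reports epoch milliseconds, or None
    when no document in the bucket has the field.
    """
    cutoff = now_ms - max_age_minutes * 60_000
    buckets = response["aggregations"]["by_index"]["buckets"]
    return sorted(
        b["key"]
        for b in buckets
        if b["latest"]["value"] is None or b["latest"]["value"] < cutoff
    )
```

New indices would be picked up automatically, at the cost of one staleness threshold for everything unless a per-index override is layered on top.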
Options I have considered so far:
- Kibana dashboards and visualizations. We already do this, and it is becoming painful to look at 60+ charts.
- Write a Python script that reads a config (list of indexes, name of the time field to query, number of minutes before data counts as stale) and run it every x minutes. For starters, I could use the .kibana index to read the list of index patterns and get each time field, if it is defined.
- Write Watcher rules. I'm not fond of writing a rule for each index.
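
A rough sketch of the script option, to show the shape of the config and the staleness check. The index patterns, field names, and thresholds in `CONFIG` are made-up examples, and `search` is a placeholder for whatever function actually sends the request (for example a thin wrapper around `POST /<index>/_search`); none of this is tested against a real cluster.

```python
import time

# Hypothetical config: one entry per index pattern to watch.
CONFIG = [
    {"index": "syslog-*", "time_field": "@timestamp", "max_age_minutes": 60},
    {"index": "snmp-*", "time_field": "@timestamp", "max_age_minutes": 15},
]


def latest_doc_body(time_field):
    """Search body that returns only the newest value of `time_field`."""
    return {"size": 0, "aggs": {"latest": {"max": {"field": time_field}}}}


def find_stale(config, search, now_ms=None):
    """Return (index, age_in_minutes) for every stale config entry.

    `search(index, body)` must return the parsed _search response as a dict.
    The age is None when the index pattern matched no documents at all.
    """
    now_ms = time.time() * 1000 if now_ms is None else now_ms
    stale = []
    for entry in config:
        response = search(entry["index"], latest_doc_body(entry["time_field"]))
        # A max aggregation on a date field reports epoch milliseconds.
        latest = response.get("aggregations", {}).get("latest", {}).get("value")
        if latest is None:
            stale.append((entry["index"], None))
            continue
        age_minutes = (now_ms - latest) / 60_000
        if age_minutes > entry["max_age_minutes"]:
            stale.append((entry["index"], age_minutes))
    return stale
```

Run from cron every x minutes, the returned list could feed an email or chat alert. The downside remains that `CONFIG` has to be edited for each new interface unless it is generated from somewhere else.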
Thx for any brainstorming ideas