How to "drill down" through Filebeat monitoring to identify emergent errors or trends?

We are currently doing a trial of Elastic Cloud and we're using Filebeat to gather logs from a few hosts. At the moment we're taking some simple performance measurements and (generally) things look positive.

In our production scenario, we'll have about 300 hosts, each running Filebeat (tailing 1 to 3 log files each) and sending data to Elasticsearch.

Suppose that one or more of these 300 Filebeat instances stops sending data, encounters output errors, or otherwise misbehaves. Ideally, we'd have alerts set up to flag such problems almost immediately. At a minimum, we'd like some visualizations and dashboards that start from aggregated data and let us sift through it in a minute or two.

I don't see a pre-configured way to, for example, look at aggregate output errors across all (or a subset of) Filebeat instances, or to investigate idle or failed instances that aren't publishing events at all.

Has anyone else dealt with this challenge? If so, how did you attack it? My naive instinct is to build some custom visualizations by looking at the .monitoring-beats indices, examining the fields reported in there, and doing some queries accordingly.
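To make that concrete, here's the kind of query I have in mind: a rough sketch in Python against the `.monitoring-beats-*` indices, assuming monitoring collection is writing `beats_stats` documents there. The exact field names may differ by stack version, and the endpoint and credentials below are placeholders.

```python
# Rough sketch: find Filebeat instances that have gone quiet by checking when
# each beat last reported into the monitoring indices. Field names
# (beats_stats.beat.name, timestamp) are what I see in our .monitoring-beats-*
# documents, but may vary by stack version.
import datetime

import requests

ES_URL = "https://my-cluster.example.com:9243"   # hypothetical cluster endpoint
AUTH = ("elastic", "changeme")                    # hypothetical credentials
STALE_AFTER_MINUTES = 10

query = {
    "size": 0,
    "query": {"range": {"timestamp": {"gte": "now-1h"}}},
    "aggs": {
        "per_beat": {
            "terms": {"field": "beats_stats.beat.name", "size": 500},
            "aggs": {"last_seen": {"max": {"field": "timestamp"}}},
        }
    },
}

resp = requests.post(f"{ES_URL}/.monitoring-beats-*/_search", json=query, auth=AUTH)
resp.raise_for_status()

now = datetime.datetime.now(datetime.timezone.utc)
for bucket in resp.json()["aggregations"]["per_beat"]["buckets"]:
    # max agg on a date field returns epoch milliseconds in "value"
    last_seen = datetime.datetime.fromtimestamp(
        bucket["last_seen"]["value"] / 1000, tz=datetime.timezone.utc
    )
    idle_minutes = (now - last_seen).total_seconds() / 60
    if idle_minutes > STALE_AFTER_MINUTES:
        print(f"{bucket['key']}: no stats for {idle_minutes:.0f} minutes")
```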

There are a few ways to do this.

The first would be to set up some Machine Learning jobs on the logs to watch for unusual patterns (errors, rate changes, etc.). From there you can also easily enable Alerting.
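For instance, here's a minimal sketch of such a job: a `low_count` detector partitioned by host, which flags hosts whose event rate drops unexpectedly, created through the anomaly detection job API. The job/datafeed names, index pattern, and `host.name` field are assumptions; adjust them to your own data.

```python
# Minimal sketch of an anomaly detection job that flags unusual drops in the
# per-host event rate. Names, index pattern, and field names are assumptions.
import requests

ES_URL = "https://my-cluster.example.com:9243"   # hypothetical cluster endpoint
AUTH = ("elastic", "changeme")                    # hypothetical credentials

job = {
    "description": "Per-host Filebeat event rate (flags hosts that go quiet)",
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [
            {
                # low_count flags buckets where the event count is unusually low
                "function": "low_count",
                "partition_field_name": "host.name",
                "detector_description": "low event count per host",
            }
        ],
        "influencers": ["host.name"],
    },
    "data_description": {"time_field": "@timestamp"},
}
requests.put(f"{ES_URL}/_ml/anomaly_detectors/filebeat-host-rate", json=job, auth=AUTH)

# The datafeed connects the job to the Filebeat data
datafeed = {
    "job_id": "filebeat-host-rate",
    "indices": ["filebeat-*"],
    "query": {"match_all": {}},
}
requests.put(
    f"{ES_URL}/_ml/datafeeds/datafeed-filebeat-host-rate", json=datafeed, auth=AUTH
)
```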

The second would be to set up simple thresholds and then do Alerting on those.
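As a sketch of that second approach, here's a Watcher watch that fires when fewer than the expected number of distinct Filebeat instances have reported monitoring stats recently. The watch ID, expected count, and logging action are placeholders (in practice you'd wire up an email or webhook action), and this assumes your license includes Watcher.

```python
# Sketch of a simple threshold alert via Watcher: every 5 minutes, count the
# distinct Filebeat instances that reported monitoring stats in the last 10
# minutes and alert when the count drops below the expected fleet size.
import requests

ES_URL = "https://my-cluster.example.com:9243"   # hypothetical cluster endpoint
AUTH = ("elastic", "changeme")                    # hypothetical credentials
EXPECTED_BEATS = 300

watch = {
    "trigger": {"schedule": {"interval": "5m"}},
    "input": {
        "search": {
            "request": {
                "indices": [".monitoring-beats-*"],
                "body": {
                    "size": 0,
                    "query": {"range": {"timestamp": {"gte": "now-10m"}}},
                    "aggs": {
                        # distinct Filebeat instances seen in the window
                        "active_beats": {
                            "cardinality": {"field": "beats_stats.beat.uuid"}
                        }
                    },
                },
            }
        }
    },
    "condition": {
        "compare": {
            "ctx.payload.aggregations.active_beats.value": {"lt": EXPECTED_BEATS}
        }
    },
    "actions": {
        # placeholder action: swap in email/webhook for real notifications
        "log_missing_beats": {
            "logging": {
                "text": "Fewer than {{ctx.metadata.expected}} Filebeat instances "
                        "reported in the last 10 minutes"
            }
        }
    },
    "metadata": {"expected": EXPECTED_BEATS},
}

requests.put(f"{ES_URL}/_watcher/watch/filebeat-fleet-check", json=watch, auth=AUTH)
```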

Thanks, I found the fields in the beats_stats structure that are used in the monitoring dashboard.
