Metric output to elasticsearch via filebeat

umesh2020 · January 28, 2021, 8:01pm

I have a question related to metrics collection. The application writes metric data into a file. Can Filebeat pick up this information, say using the log or filestream input, and send it to Elasticsearch? What would be the difference in this case compared to running Metricbeat to collect this information?

shaunak · January 29, 2021, 8:14pm

Great question!

Assuming the metrics are written in a structured format in the log file, you can certainly use Filebeat to pick them up and parse them into JSON documents for Elasticsearch to index.
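
As a minimal sketch of what "structured format" could look like on the application side, the snippet below appends one JSON object per line to a metrics file. The field names and the `format_metric` / `emit_metric` helpers are illustrative assumptions, not something prescribed by Filebeat. Lines like these can then be decoded by Filebeat's JSON handling (for example the `json.*` options of the log input or the `ndjson` parser of the filestream input).

```python
import json
import time


def format_metric(name, value, now=None):
    """Return one metric as a single-line JSON string (hypothetical schema)."""
    ts = time.time() if now is None else now
    doc = {
        # UTC timestamp in ISO 8601 form, second resolution
        "@timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(ts)),
        "metric": {"name": name, "value": value},
    }
    return json.dumps(doc)


def emit_metric(path, name, value):
    """Append the metric to the file, one JSON document per line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(format_metric(name, value) + "\n")
```

Calling `emit_metric("app-metrics.ndjson", "queue_depth", 17)` would append a single line to a file that Filebeat is configured to read; the file name here is only an example.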

Metricbeat is best for applications with well-known metrics formats, e.g. apache, kafka, mysql, etc. However, you can use it to monitor custom application metrics via something like the http.json metricset (which periodically pulls metrics from your application's HTTP endpoint serving JSON) or the http.server metricset (which allows your application to push JSON-formatted metrics documents to Metricbeat).

Now, to consider the tradeoffs between using Filebeat to read structured metrics from log files vs. using the http.json metricset vs. using the http.server metricset:

With the http.json metricset, Metricbeat will periodically pull metrics from your application. This has two implications:
- If your application's endpoint is occasionally unreachable for some reason or times out, the metrics during that collection instance will be lost.
- If your application's metrics, particularly gauge metrics, change drastically between two collection instances, those intermediate metrics will never be collected.
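
To make the second point concrete, here is a small simulation with made-up gauge readings (one per second) and a poller that only looks every five seconds. It is only a sketch of the sampling effect, not of how Metricbeat is implemented.

```python
def poll(samples, every):
    """Return the values a poller sees if it reads every `every`-th sample."""
    return samples[::every]


# Hypothetical gauge readings, one per second; the spike lasts two seconds.
gauge = [10, 11, 10, 95, 96, 12, 10, 11, 10, 10]

seen = poll(gauge, every=5)   # poller collects once every 5 seconds
print(seen)                   # [10, 12]
print(max(gauge), max(seen))  # 96 12
```

The spike to 96 falls between two collections, so it never shows up in the polled data.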

With the http.server metricset, your application is in control of when to push metrics to Metricbeat, so both of the disadvantages mentioned above largely go away. However, you could still run into one disadvantage:
- If Metricbeat is occasionally unreachable for some reason, metrics sent during this period might be lost, unless your application implements some kind of retry logic.
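
The retry logic mentioned above could look roughly like the following. This is a sketch under assumptions: `post_json` and `push_with_retry` are hypothetical helpers, the retry count and backoff are arbitrary, and the URL you pass in must match wherever your http.server metricset is configured to listen.

```python
import json
import time
import urllib.request


def post_json(url, doc, timeout=5):
    """POST one JSON document; raises OSError (URLError) on network failure."""
    req = urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def push_with_retry(doc, send, attempts=3, delay=1.0):
    """Call send(doc) up to `attempts` times, doubling the wait between tries.

    Returns True on success, False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            send(doc)
            return True
        except OSError:
            # Wait before the next try, but not after the final failure.
            if attempt < attempts - 1:
                time.sleep(delay * (2 ** attempt))
    return False
```

A call such as `push_with_retry({"queue_depth": 17}, lambda d: post_json("http://localhost:8080/", d))` would try three times before giving up; the address is an assumed example. Note that a document is still lost if every attempt fails, unless the application also buffers it somewhere.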

With the Filebeat approach using the log or filestream input, your application should be writing logs to a local filesystem and Filebeat should be reading from the same local filesystem. This removes the network-related disadvantages mentioned above. Also, the file acts as a natural buffer in case Filebeat needs to be restarted for some reason. And your application still gets to control how often to emit metrics to the file, which is good. The only disadvantage you run into is:
- You are now limited by the amount of disk space. But this can be managed by setting up log rotation on your log files. Filebeat does not perform log rotation, but it is able to handle log files that are rotated.
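
If the application happens to be written in Python, one way to get size-based rotation is the standard library's `RotatingFileHandler`. The sketch below is only an illustration: the `make_metrics_logger` helper, the logger name and the size limits are assumptions, and other runtimes or an external tool such as logrotate would do the same job.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_metrics_logger(path, max_bytes=1_000_000, backups=3, name="app.metrics"):
    """Logger that writes raw lines to `path` and rotates the file by size.

    Disk use is bounded by roughly max_bytes * (backups + 1).
    """
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(message)s"))  # write the line as-is
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep metric lines out of the root logger's output
    logger.addHandler(handler)
    return logger
```

Each call to `logger.info(...)` with a pre-formatted metric line then writes one line to the file, and older files are renamed to `path.1`, `path.2`, and so on until the backup count is reached.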

Hope that helps,

Shaunak

umesh2020 · February 2, 2021, 2:58pm

Thanks Shaunak for the detailed explanation. Appreciate your input. Will try out your suggestion.

umesh2020 · February 2, 2021, 4:58pm

Hi Shaunak,
My application uses Dropwizard to collect metrics and writes them to a log file. In this case, what should the Filebeat config be to indicate that it's sending metrics and not just logs?

Thanks,
Umesh

system · March 2, 2021, 6:58pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
