Determining Filebeat -> Elasticsearch performance


I understand that the throughput from Filebeat to Elasticsearch depends on many things, and before optimizing I want to know whether it's really necessary.

I have a high-load app that generates a huge amount of logs, so I want to make sure that Filebeat ships logs (directly to Elasticsearch) faster than the logfile grows. Can I get information about the number of unprocessed lines in a file, or the delta between the processed and total line counts? Does Filebeat provide this kind of metric out of the box? If not, is comparing the offsets in the registry file against the total size of each file the right way to get this kind of metric?

I think you should be able to get this from the monitoring output when viewing Filebeat's log on stdout (filebeat -e).

As an example, I think these metrics may be of interest (in the log this is all on one line, so you may need to clean it up with a JSON formatting tool):

  "filebeat": {
    "events": {
      "added": 944,
      "done": 944
    },
    "harvester": {
      "open_files": 0,
      "running": 0
    }
  }
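
Those counters can also be checked programmatically. Here's a minimal sketch in Python, assuming you've already captured one of these periodic metrics objects as a JSON string (the values are just the ones from the excerpt above, and the surrounding fields are trimmed):

```python
import json

# One-line metrics object as it might appear in Filebeat's stdout log,
# trimmed to the fields shown in the excerpt above.
raw = '{"filebeat": {"events": {"added": 944, "done": 944}}}'

metrics = json.loads(raw)
events = metrics["filebeat"]["events"]

# If `added` consistently outruns `done` between samples, Filebeat is
# accepting lines faster than it can ship them to Elasticsearch.
lag = events["added"] - events["done"]
print(lag)  # 0 for this sample
```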

I didn't find these metrics out of the box, so I implemented this myself by parsing the registry file and comparing the offsets (bytes already processed) with the current logfile sizes, and it works more or less accurately.

Now I have information about each file's growth and processing speed. This info also lets me manage file rotation: I can remove files as soon as they've been fully read and keep "slow" files around longer.
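The offset-vs-size comparison could be sketched like this in Python. Note the assumptions: this targets the older Filebeat registry layout, a single JSON array of entries with "source" (file path) and "offset" (bytes read) fields; newer Filebeat versions use a different, log-structured registry, so the parsing would need adapting there. The function name is just for illustration:

```python
import json
import os

def backlog_bytes(registry_path):
    """Return {logfile: bytes not yet processed} by comparing each
    registry offset with the file's current size on disk.

    Assumes the older Filebeat registry format: one JSON array of
    entries carrying "source" and "offset" fields.
    """
    with open(registry_path) as f:
        entries = json.load(f)
    backlog = {}
    for entry in entries:
        path = entry["source"]
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # file rotated away or deleted since the last registry update
        backlog[path] = size - entry["offset"]
    return backlog
```

A file is then safe to rotate out once its backlog reaches zero, while a persistently large backlog flags a "slow" file worth keeping longer.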

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.