Filebeat skipping events after after restarting to harvest

Bkt · March 19, 2018, 11:38pm

Hello,
I am investigating missing events scenario. The setup is as follows
Linux box running filebeat/ logstash/ elasticsearch and a program which generates the logs. Its on a local drive (maybe we can rule out network drive issue)
The events are there in the log file and hence maybe we can rule out the log rotation variable, since all the events are present in the same file (around the middle of the file)
The filebeat has the following logs around the same time

    2018-03-16T06:46:30-07:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=4 filebeat.harvester.running=4 filebeat.harvester.started=4 filebeat.prospector.log.files.truncated=1 libbeat.logstash.call_count.PublishEvents=2 libbeat.logstash.publish.read_bytes=12 libbeat.logstash.publish.write_bytes=198152 libbeat.logstash.published_and_acked_events=2113 libbeat.publisher.published_events=2113 publish.events=2117 registrar.states.update=2117 registrar.writes=2
    2018-03-16T06:47:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:47:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:48:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:48:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:49:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:49:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:50:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:50:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:51:00-07:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.closed=1 filebeat.harvester.open_files=-1 filebeat.harvester.running=-1 publish.events=1 registrar.states.update=1 registrar.writes=1
    2018-03-16T06:51:10-07:00 INFO File is inactive: /scratch/apps/deployed_versions.log. Closing because close_inactive of 5m0s reached.
    2018-03-16T06:51:30-07:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.closed=4 filebeat.harvester.open_files=-4 filebeat.harvester.running=-4 publish.events=4 registrar.states.update=4 registrar.writes=2
    2018-03-16T06:52:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:52:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:53:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:53:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:54:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:54:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:55:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:55:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:56:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:56:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:57:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:57:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:58:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:58:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:59:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T06:59:30-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T07:00:00-07:00 INFO No non-zero metrics in the last 30s
    2018-03-16T07:00:10-07:00 ERR Failed to publish events caused by: write tcp 127.0.0.1:42938->127.0.0.1:5044: write: connection reset by peer
    2018-03-16T07:00:10-07:00 INFO Error publishing events (retrying): write tcp 127.0.0.1:42938->127.0.0.1:5044: write: connection reset by peer

so when it starts to harvest it missed a few lines (took the first 8 lines but the next 9 lines were skipped.) I am not too sure what could have go off, but similar pattern is noticed with other files which are not written very frequently.

ps: not too sure how can I add the logs in a better way in this post.

ruflin · March 21, 2018, 7:08am

Can you share your config file and the version of Filebeat you are using?

How did you check that you lost the log lines? In the above there is a sending error, so it could be that the events are still queued on the Filebeat side.

For the formatting I recommend to you 3 backticks before and after the log / config files you copy here.

Bkt · March 21, 2018, 5:58pm

Thank you @ruflin for taking time to read / reply, I formatted, as you suggested with backticks, and is much more readable now.
the filebeat config file is as below

filebeat.prospectors:
- input_type: log
  paths:
  - /scratch/apps/*.log

output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

the version is filebeat-5.6.3-linux-x86_64

with regards to the

Blockquote
In the above there is a sending error, so it could be that the events are still queued on the Filebeat side.

I posted the question after around 2 days of the logs (not sure if filebeat would keep trying over two days, if it does its good) I had manually checked for the missing entries because I expected there to be 17 entries (we have a group of 17 entries but when I saw only 8 I looked at the actual logs to verify the entries were indeed present in the logs but absent in when searched via kibana). I verified right now and the events are still missing

ruflin · March 26, 2018, 9:12am

Filebeat will keep retrying until it succesfully sent the events.

As you send through LS I'm wondering what processing you do on the LS side and if the event could be lost there?

To verify that Filebeat ships all log lines you could enable file output and check if all events show up there.

One thing that could also happen that you don't see the events in Kibana because the timestamp is not in the range you are looking at. Perhaps use a count query in ES to verify how many events you have.

Bkt · March 30, 2018, 7:59pm

Thanks Ruflin. I will check if the events are lost at logstash, (though I doubt it they succeed if I re-parse the log file from filebeat) by enabling the file output too,
BTW the logstash part is GROK filters and JSON filters

system · April 27, 2018, 8:00pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Filebeat missing some log lines Beats filebeat	4	2071	July 5, 2017
Filebeat is losing events due to file rotate Beats filebeat	5	2484	July 1, 2016
Sometimes filebeat stops collecting/flushing events Beats filebeat	3	293	April 1, 2021
Filebeats Deleted files and potentially missing data Beats	1	755	August 31, 2017
Filebeat is not sending new events to logstash Beats filebeat	1	278	November 30, 2021

Filebeat skipping events after after restarting to harvest

Related topics