Filebeat does not harvest all logs from application

I'm running into an issue where Filebeat does not harvest all logs (only a small number of them) from my app. Filebeat harvests logs from Docker containers and sends them to Logstash. The application is running in Docker Swarm.
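For reference, a minimal sketch of a typical Filebeat input configuration for this kind of setup (not the exact config used here; the path, the processor, and the Logstash host are placeholders/assumptions):

filebeat.inputs:
  # "container" is the Filebeat 7.x input type for Docker JSON log files;
  # the path below is the Docker default on the host.
  - type: container
    paths:
      - /var/lib/docker/containers/*/*-json.log

processors:
  # Attaches container name, image and labels (e.g. com.docker.stack.namespace).
  - add_docker_metadata: ~

output.logstash:
  hosts: ["logstash:5044"]   # placeholder endpoint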

The index in Elasticsearch has almost 600 docs. The container log directory contains 5 files (log rotation is set to 5 files x 5 MB), which together hold almost 20k log lines:

cat b239dcd9b90c473da8715b42b486540229176db8dde6620758f205aee8701260-json.log* | wc -l
19267

I have checked Logstash for errors; there are no error logs about the app. I set Logstash to log messages to stdout. From the logs, I can see there are only 124 messages:

kctl logs --since=30m logstash-logstash-0 | grep com_docker_stack_namespace | grep myapp | wc -l
124

None of them are errors, just JSON logs. The Logstash pipeline only does the following (a rough sketch is shown after the list):

  • sets the index metadata from the container name plus the date
  • parses JSON messages if possible
  • drops the event if the container name is missing - I can see that the name is present in at least some events, because some logs have been sent to the right ES index.
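A rough sketch of what such a pipeline can look like (not the exact pipeline used here; field names such as [container][name] and the index naming are assumptions):

filter {
  # Parse the message as JSON when possible; invalid JSON is passed through untouched.
  json {
    source => "message"
    skip_on_invalid_json => true
  }

  # Drop events that carry no container name at all.
  if ![container][name] {
    drop { }
  }

  # Build the target index name from the container name plus the date.
  mutate {
    add_field => { "[@metadata][index]" => "%{[container][name]}-%{+YYYY.MM.dd}" }
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]   # placeholder
    index => "%{[@metadata][index]}"
  }
}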

I set Filebeat to debug level and checked the logs. There are no error messages. I checked some of the stats it logs about offsets, etc., and found out (see also the registry check sketch after the list):

  • read_offset is changing
  • size is changing - logs are still being written to the log files
  • from Kibana, I can see that logs are sent to ES only once every hour or two.
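To cross-check what Filebeat has persisted for this file, the registry can be inspected as well. A sketch, assuming the official Filebeat Docker image defaults (path.data = /usr/share/filebeat/data) and a locally running Filebeat container; adjust the container name and path to the actual deployment:

# <filebeat-container> and the registry path are assumptions; adjust to your setup.
docker exec <filebeat-container> \
  sh -c 'grep -r "b239dcd9b90c" /usr/share/filebeat/data/registry/filebeat/ | tail -n 5'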


The Filebeat logs also contain:

Check file for harvesting
Update existing file for harvesting, offset: 5021
Harvester for file is still running
State for file not removed because harvester not finished
Remove state of file as its identity has changed
State for file not removed because harvester not finished
Remove state of file as its identity has changed
...
...
...

The last two messages are repeated a few times, and then the larger log entry mentioned above (with metadata about all the offsets, etc.) appears again.

From the logs, I can see that the offset is still stuck at 5021:

{"log":"{\"level\":\"debug\",\"timestamp\":\"2022-03-21T15:09:57.502+0100\",\"logger\":\"input\",\"caller\":\"log/input.go:530\",\"message\":\"Update existing file for harvesting: /var/lib/docker/containers/b239dcd9b90c473da8715b42b486540229176db8dde6620758f205aee8701260/b239dcd9b90c473da8715b42b486540229176db8dde6620758f205aee8701260-json.log, offset: 5021\"}\n",

I tried restarting Filebeat; it was fixed for a few minutes and then the issue occurred again.

Resources for Filebeat should be sufficient. From docker stats (see the sketch after this list):

  • Filebeat CPU usage is at 15%
  • Filebeat memory usage is at 30%.
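Something like the following one-shot check (the container name is a placeholder):

# <filebeat-container> is a placeholder; --no-stream prints a single snapshot.
docker stats --no-stream <filebeat-container>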

Why is this happening, and how can I solve or debug it?

Versions:
Filebeat - 7.9.3
Logstash - 7.9.2
