Filebeat slowness due to continuously keeping 200+ files open

Hi team,

We are suffering from Filebeat (version 7.0.1) slowness. Our application generates logs in several file locations, and once a log file reaches 2 GB in size it renames the old file and starts writing a new one. At some point, for some reason, a large amount of logs piled up on the VM, and after that Filebeat's behaviour changed: it now keeps between 150 and 400 log files open at all times. It is no longer performing well, and it sends logs to Logstash too slowly and too late. Sometimes it sends duplicate events (the same event twice or thrice at the same time), and we have also observed it sending the same event to two Logstash instances at once, even though we have load balancing enabled for Logstash.
There is no back pressure from Logstash, so it is only Filebeat that is causing the issue.

Filebeat Configuration:

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /path/to/log/*/*.log

  # Ignore files whose last modification is older than 24h.
  ignore_older: 24h
  # Close a harvester if its file has produced no new data for 1m.
  close_inactive: 1m
  # Close a harvester when its file is removed.
  close_removed: true
  # Force-close every harvester after 6h, regardless of read position.
  close_timeout: 6h
  # Drop registry entries for files that no longer exist on disk.
  clean_removed: true
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["<ip1>","<ip2>","<ip3>","<ip4>"]
  loadbalance: true
  worker: 10
  compression_level: 2

Thanks and regards
Rohit

This is most likely related to file handler issues that have been resolved in the many versions since 7.0 (we are at 7.12 at this point, so it's quite a few releases). I do not think there are many plausible ways to resolve this without upgrading, but I will check around.

1 Like

Thanks @Marius_Iversen for your quick response.
I will be waiting for your reply to know whether there is any way to solve this issue in FB 7.0.1,
and it would also be great if you could share some docs/threads to help us understand the exact cause of this issue.

Also, has this issue been resolved in FB 7.9?

1 Like

There are a few things that could be checked, but the reason I usually ask people to upgrade first is that if we end up troubleshooting for a while and in the end find that the issue has been fixed in a later release, the only solution at that point would be to upgrade anyway.

Since 7.0.1 was released 2 years ago, and with our frequent release cycle that is a massive number of updates and bugfixes, it's hard to tell.

Looking at your path configuration here:

  paths:
    - /path/to/log/*/*.log

There are a few things you could do to confirm or rule things out. For example, do the renamed files also match this glob pattern? And how are they renamed?
It could be that a file is read and, when moved, is read again, for example.

It could also not be an issue with Filebeat at all: you are saying that the VM is getting more logs than before, and at that point Filebeat is monitoring more files, which does make sense, as it monitors every file until close_inactive is met.
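
If the sheer number of open files is the bottleneck, one knob worth experimenting with is harvester_limit, which caps how many harvesters a single input keeps open at once. A minimal sketch (the value of 200 is only illustrative, and I have not verified this against your setup):

  - type: log
    paths:
      - /path/to/log/*/*.log
    # Cap concurrent open harvesters for this input;
    # 0 (the default) means unlimited.
    harvester_limit: 200
    close_inactive: 1m

Note that harvester_limit only helps if close_inactive (or another close_* option) actually frees up slots; otherwise files simply wait longer before a harvester picks them up.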

It would be hard to find the exact reason at this point, but I would still recommend the upgrade first.

1 Like

Hi @Marius_Iversen,

We are using Apache Storm, which writes only to the worker.log file. As @rohitguptaggg explained above, Storm rotates the files once they reach the defined limit of 2 GB, renaming them to worker_1.log, worker_2.log, ... worker_x.log. Storm no longer writes to these worker_x.log files once they have been rotated.

But since FB works on inodes, would the file rotation and renaming be causing an issue here?

Another question: if we change our pattern to the one below, and a file is rotated (to worker_x.log) by Storm before FB has harvested it completely, would that cause data loss? Or, since FB again works on inodes, would it read the complete file and stream the data even after the rotation, hence no data loss?

  paths:
    - /path/to/log/*/worker.log

Please suggest.

1 Like

@aksadvance, the file rotation and renaming should not really cause any issues, and in general, as you said, when the path matches both the existing files and the renamed files, it should still be fine.

When FB starts its harvester for a file, the file is added to the internal registry, and it should not really be affected by the renaming either way. However, there can always be quirks or niche use cases in which the renamed file is no longer recognized, and I just wanted to rule those out.
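
To make that concrete: on Linux, the registry keys each file's state by inode and device, not by its name. A single entry on 7.0.x looks roughly like the following (shown as YAML for readability; on disk it is JSON, and every value here is hypothetical):

  # One registry entry; all values are hypothetical.
  source: /path/to/log/9/worker.log      # last path the file was seen under
  offset: 1048576                        # byte offset to resume reading from
  timestamp: "2021-05-01T10:00:00.000Z"
  ttl: -1
  type: log
  FileStateOS:
    inode: 12345                         # identity on Linux is inode + device,
    device: 2049                         # not the file name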

This could also simply be a very old bug, or a misconfiguration somewhere. The questions from my side were simply a way to start the troubleshooting, as we have to start somewhere :slight_smile:

I think the path is still needed, as the path might be included in the internal registry as well; I would recommend giving it a go, as this is very rarely an issue.
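
If you do narrow the input to the active file only, a minimal sketch of what that could look like (close_renamed is shown only to highlight its default of false, which means a harvester keeps reading a renamed file until a close_* condition closes it, so rotation alone should not lose data that is already being harvested):

  - type: log
    enabled: true
    paths:
      # Active file only; rotated worker_x.log files no longer match.
      - /path/to/log/*/worker.log
    # Default is false: keep harvesting a file after it is renamed.
    close_renamed: false
    close_inactive: 1m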

2 Likes

Filebeat 7.0 is EOL, as Marius mentions; you'd be best off upgrading to 7.12 and seeing if the issue is resolved, then returning to troubleshooting if it is not.

3 Likes

@warkolm @Marius_Iversen
Currently we can't upgrade FB, as we are in production.

Please look at the use cases below. In our current setup log rotation is very quick, because of the heavy log generation in live production. We are running some test cases to understand the issue, and for that we need some clarity on the following use cases:

1. If the inode changed and the filename remains the same

Before restart:
    filename : 9_worker.log
    inode : 12345
After restart:
    filename : 9_worker.log
    inode : 67891

2. If the inode is the same and the filename is different

Before restart:
    filename : 9_worker.log
    inode : 12345
After restart:
    filename : 21_worker.log
    inode : 12345

For case #2: after restarting FB, we observed the complete log for 9_worker.log being re-sent. We don't know why it resent the logs from 9_worker.log, as the file was in the registry with the same inode but a different filename.

Open question:
- Is the offset tracked based on the inode, the filename, or both?

Hi @Marius_Iversen, I performed the scenario below on both Filebeat 7.0.1 and 7.12.1:

Inode same but filename different.

1st:

1. Started Filebeat with the log path pattern `*.log`
2. Stopped Filebeat
3. Renamed the file 01_worker.log to worker.log (to get a different name with the same inode)
4. Restarted Filebeat (without any config changes)
(There was an entry for the inode and filename in the registry)
# Observation: Filebeat resumed reading from the previous state.

2nd:

1. Started Filebeat with the log path pattern `worker.log`
2. Stopped Filebeat
3. Renamed the file worker.log to 01_worker.log
4. Restarted Filebeat with the log path pattern `*.log`
(There was an entry for the inode and filename in the registry)
# Observation: Again, Filebeat resumed reading from the previous state.

3rd:

1. Started Filebeat with the log path pattern `*.log`
2. Stopped Filebeat
3. Renamed the file 01_worker.log to worker.log
4. Restarted Filebeat with the log path pattern `worker.log`
(There was an entry for the inode and filename in the registry)

# Observation: In this case, Filebeat started reading the file from scratch.

Can you please help us understand why Filebeat does not resume from the previous state in case #3?
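
For reference, on 7.12.1 we left file_identity at its default, which, as far as we understand, is equivalent to the following (identifying files by inode + device):

  - type: log
    paths:
      - /path/to/log/*.log
    # Default identity strategy on 7.x: track files by inode + device.
    # (file_identity.path: ~ would track by path instead; untested by us.)
    file_identity.native: ~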

1 Like

@Marius_Iversen @warkolm the team has shared the analysis above, and we noticed the same behaviour in the 7.0.1, 7.9, and 7.12.1 versions.

Could you please suggest why Filebeat is not considering the inode value in the registry on restart when the file was rotated/renamed but the inode is still the same?

We explained the detailed info above. Alternatively, please redirect us to someone who can help with this.
Thanks for your help and support.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.