Missing logs with log rotation


(Amos Shahar) #1

Hi,
I am using Filebeat (version 1.2.3-1 on the AWS Linux AMI) to forward logs to Logstash, and some logs are missing (I think it happens every time the file rotates).
The file in question rotates every 20 MB, and at peak time it rotates every 1-2 minutes. The rotated files are named:
filename.log, filename.log.1, filename.log.2 ....
Relevant YAML conf:
paths:
  - /my_path/*.log
ignore_older: 24h
scan_frequency: 1s
tail_files: false

logstash:
  hosts: ["ls-mydomain.com:4055"]
  loadbalance: true

Any idea what could be wrong, and how to troubleshoot it?
In general, can Filebeat handle log rotation under this kind of load? Please note that Filebeat is configured to ship many other files, and those that are not under load are fine.

I tried changing the parameters above, but that did not solve it.

Thanks,
Amos


(Magnus Bäck) #2

Exactly how are the files rotated? Are they renamed? Or copied and truncated?


(Amos Shahar) #3

I am not sure. It uses log4j and, as I mentioned, the names are as follows:
filename.log
filename.log.1
filename.log.2
....

Amos


(Steffen Siering) #4

Your glob pattern does not match the renamed files, so Filebeat has problems finding these rotated files.
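
To illustrate the point above: a glob with a trailing wildcard would also match the rotated names listed earlier (filename.log.1, filename.log.2, ...). This is only a sketch in the Filebeat 1.x config layout. /my_path is the placeholder path from the first post, and whether shipping the rotated files is desirable depends on the setup.

```yaml
filebeat:
  prospectors:
    -
      paths:
        # The trailing * also matches filename.log.1, filename.log.2, ...
        - /my_path/*.log*
      input_type: log
```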


(Amos Shahar) #5

I tried all of these patterns ("log.*", ".log", "log*"), but messages are still missing.
The only configuration that solves the problem is setting the rotation size very large (so there is no rotation); then I get exactly ALL the messages. It seems that Filebeat has issues with log rotation at high volume.
Any ideas, anyone?

Amos


(ruflin) #6

It seems like you are hitting this bug here: https://github.com/elastic/beats/pull/1954

Could you try the nightly build to see if this resolves your problem? https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/

The problem is more severe in your case because your scan_frequency is quite low.
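
A minimal sketch of a less aggressive setting, assuming the 1.x config layout and the placeholder path from the first post. The 10s value is Filebeat's documented default for scan_frequency. It is shown as a starting point for testing, not as a confirmed fix for the bug linked above.

```yaml
filebeat:
  prospectors:
    -
      paths:
        - /my_path/*.log
      # Original post used 1s; 10s is the default scan interval.
      scan_frequency: 10s
```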


(Amos Shahar) #7

Issue has been resolved with this version - Thanks!
filebeat-5.0.0-alpha5-SNAPSHOT-x86_64.rpm

When will you have a stable release with this bug fix?

Amos


(ruflin) #8

Glad it works with the most recent version. The first beta with these changes should be released in the next few weeks.


(Amos Shahar) #9

I am sorry, but after an hour or two it stopped working again ....
This is a showstopper for the whole project. Filebeat forwards only a few log lines every minute, while I have more than 2000 log lines every minute.
What information do you need in order to help solve this issue?

Amos


(ruflin) #10

Can you post part of your log file? Please set the log level to at least INFO, best would be DEBUG to see all the details.
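
For reference, a sketch of how debug logging to a file might be enabled in the Filebeat config. The nested logging keys are standard Beats logging options; the path and name values here are assumptions chosen for illustration and should be adapted to the host.

```yaml
logging:
  # debug is the most verbose level; info is the minimum asked for above.
  level: debug
  to_files: true
  files:
    # Assumed location and file name, not taken from this thread.
    path: /var/log/filebeat
    name: filebeat.log
```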


(Amos Shahar) #11

I can send you the log file and the content of some of the files, but I prefer not to publish them as customer information is involved. Do you have an email address?

Thanks,
Amos


(Amos Shahar) #12

You can see the Filebeat INFO log at:
http://open-voip.org/images/0/08/Filebeat.txt
and the registry file:
http://open-voip.org/images/b/b7/Registry.txt

The problematic file is webSocket.log

Thanks,
Amos


(ruflin) #13

Thanks for sharing some log files here. I had a quick look at the Filebeat log and there is nothing really suspicious. Every 30s it publishes between 30k and 60k events, which sounds like enough to me to cover your case above.

Can you share the log lines from when you think that not all events are published? Or is that the case with the excerpt you shared? Can you share again the full config that you used for these tests?


(Amos Shahar) #14

Here is the conf file:

filebeat.prospectors:
- input_type: log
  paths:
    - /logs/tnet/webSocketEvents.log
    - /logs/tnet/FIX*.log
    - /logs/tnet/tomcatS*.log
    - /logs/tnet/dbPr*.log
  ignore_older: 2m
  fields:
    level: info

output.logstash:
  hosts: ["ls-mydomain.com:4055"]

I tried a few options for the ignore_older parameter; all failed.

Thanks,
Amos


(ruflin) #15

Did you try without ignore_older?

Can you share the log lines from when you think that not all events are published? Or is that the case with the excerpt you shared?
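
A sketch of what such a test config could look like, based on the conf file posted in #14 with the ignore_older line removed and only the first path kept. Narrowing the test to one file is an assumption made here to keep the debug output readable, not something suggested in the thread.

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    - /logs/tnet/webSocketEvents.log
  # ignore_older deliberately omitted for this test run.
  fields:
    level: info

output.logstash:
  hosts: ["ls-mydomain.com:4055"]
```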

