Log entries get truncated on file rotation

Hi there,

We are using Filebeat 5.0.0 to ship log files. The logs rotate daily (we are using Log4j's DailyRollingFileAppender to produce them). Rotation happens every day at 00:00, and every day we get one truncated event (not necessarily the first event of the day).

We found "Filebeat is losing events due to file rotate · Issue #5503 · elastic/logstash · GitHub", which seems to be solved in 5.0.0, but we did not find anything about it in Filebeat's release notes.

Filebeat's configuration looks like this:

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/apps/*/application.log
  document_type: application
  encoding: ISO-8859-1
  multiline.pattern: '[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.match: after
  multiline.negate: true
  backoff: 1ms
  max_backoff: 1s
  backoff_factor: 1
  scan_frequency: 1s
  close_inactive: 24h
```

We think this could come from the fact that DailyRollingFileAppender does not seem to create a new file, but to copy the existing one. Would this be a good clue? If so, are there parameters we could set to avoid this behaviour?
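For what it's worth, here is a minimal shell sketch (hypothetical /tmp paths, GNU `stat` assumed) of the difference between copy-based and rename-based rotation. With a copy-and-truncate scheme the live file keeps its inode, so a reader that has not caught up loses whatever was unread at truncation time; with a rename, the reader can finish the old inode and pick up the new file:

```shell
#!/bin/sh
# Hypothetical demo paths; GNU stat assumed (Linux).
log=/tmp/rotation-demo.log
echo "event 1" > "$log"
before=$(stat -c '%i' "$log")

# Copy-based rotation (what a copying appender effectively does):
cp "$log" "$log.archived"
: > "$log"                     # truncate the live file in place
after_copy=$(stat -c '%i' "$log")
# The live file keeps its inode, so Filebeat sees the same file shrink;
# any bytes it had not read before the truncate are gone.

# Rename-based rotation:
echo "event 2" > "$log"
mv "$log" "$log.renamed"
echo "event 3" > "$log"        # the writer creates a fresh file
after_rename=$(stat -c '%i' "$log")
# The archive keeps the old inode and the live file gets a new one,
# so a reader can finish the old file and then follow the new one.

echo "copy kept inode: $([ "$before" = "$after_copy" ] && echo yes || echo no)"
echo "rename changed inode: $([ "$before" != "$after_rename" ] && echo yes || echo no)"
```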

Thank you for your help.




The issue you linked above should be fixed in the 5.0.0 release.

If log4j uses log truncation instead of log rotation, there is always a risk of losing data, as there is no guarantee that Filebeat has already read the last lines.

What exactly do you mean by "truncated event"? Can you provide some examples?

Thank you for the quick answer ;D

We mean that for one event per rotation we only get the end of the line, which causes a _grokparsefailure in our downstream processing (since there is only a partial line). Is that clearer?

The partial line is one of the first lines of the new file (the lines before the partial line get dropped).

When you say you "only get the end of the line", is the event already incomplete coming out of Filebeat, or only after the processing in Logstash? And is the problem related to multiline events?

I'm trying to understand where your problem is (LS or Filebeat). If you use just the file or console output, do you still see the problem?

Best would be if you could provide some "real" examples of your partial line.

Hi,

Sorry for being unclear; the problem comes from Filebeat, and it is not related to multiline events.
Here are the examples. Luckily, we have Filebeat's debug logging enabled, so below are our application logs and the resulting events in Filebeat.

https://drive.google.com/file/d/0By2ZBSjy9s-EQW8wZ3NLd295ajA/view?usp=sharing

In this case, after rotation, the first 10 lines of the file get ignored, the 11th line is truncated, and the rest is OK until the next rotation.

Thank you.

I'm currently trying to understand what DailyRollingFileAppender exactly does: https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/DailyRollingFileAppender.html

That page notes:

> DailyRollingFileAppender has been observed to exhibit synchronization issues and data loss. The log4j extras companion includes alternatives which should be considered for new deployments and which are discussed in the documentation for org.apache.log4j.rolling.RollingFileAppender.

Based on this note I suspect the problem could also be related to what DailyRollingFileAppender does. Any reason you use this one?
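If switching appenders is ever an option, the log4j extras companion's rolling appender mentioned in that note does a rename-style daily roll. A rough sketch of what such a `log4j.xml` fragment might look like (paths and pattern are illustrative, not taken from your setup; please check against the log4j extras documentation):

```xml
<!-- Illustrative only: log4j extras rolling appender with a daily time-based policy. -->
<appender name="FILE" class="org.apache.log4j.rolling.RollingFileAppender">
  <rollingPolicy class="org.apache.log4j.rolling.TimeBasedRollingPolicy">
    <!-- Active file stays at a fixed name; archives get the date suffix on roll. -->
    <param name="ActiveFileName" value="/var/log/apps/myapp/application.log"/>
    <param name="FileNamePattern" value="/var/log/apps/myapp/application.log.%d{yyyy-MM-dd}"/>
  </rollingPolicy>
  <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d %-5p %c - %m%n"/>
  </layout>
</appender>
```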

Back to the problem: as far as I understand, the appender logs everything to the same file but copies the last day over to a separate file. The part I didn't find is whether the same file is then truncated, or whether it is removed and a new one is created.

Well, why do we use DailyRollingFileAppender? No idea, but we have ~250 apps using it, so that's some kind of legacy :wink: I get your point about issues with DailyRollingFileAppender, but in that case, would the data still be written to the file? In our case, it is. The thing is, we've been using Logstash with the same kind of configuration for about 3 years and we never ran into a problem of this kind.

We suspected something like this, so we've been watching the inodes of the log files. At the end of the day, the file application.log is copied to application.log.TIMESTAMP (both files have the same timestamp) and a new application.log file is created.

Is it RollingFileAppender day? Filebeat locks the file when Logstash is not running :slight_smile:

That means application.log (old) and application.log (new) don't have the same inode? And also the file copied over to application.log.timestamp has a different inode? I think we are getting closer :wink:

:smiley:

No:
application.log (old) and application.log (new) have different inodes.
application.log (old) and application.log.timestamp have the same inode.
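Those inode observations match a rename-style rotation. A small shell sketch (hypothetical /tmp path and timestamp, GNU `stat` assumed) of what they imply:

```shell
#!/bin/sh
# Hypothetical demo: inodes across a rename-style rotation; GNU stat assumed.
log=/tmp/inode-watch.log
echo "23:59 event" > "$log"
live_before=$(stat -c '%i' "$log")

# The rotation as observed: the old file becomes the archive,
# and a new live file appears under the original name.
mv "$log" "$log.2016-11-14"          # illustrative timestamp suffix
echo "00:00 event" > "$log"

archive=$(stat -c '%i' "$log.2016-11-14")
live_after=$(stat -c '%i' "$log")
stat -c '%n inode=%i' "$log" "$log.2016-11-14"
# The archive keeps the pre-rotation inode; the new live file differs.
```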

So the file is not copied over but renamed? Filebeat should not have any issue with that.

Any chance you could share Filebeat's own log file from around the time the rotation happens, ideally in debug mode?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.