We are using Filebeat 5.0.0 to parse log files. Log rotate on a daily basis (we are using Log4J's DailyRollingFileAppender to produce our log files). Log rotates every day at 00:00 and we get a truncated event everyday (not necessary first event of the day).
We think this could come from the fact that DailyRollingFileAppender does not seem to create a new file but to copy existing one. Would this be a good clue ? If true, is there some parameters we could set up to avoid this behaviour ?
The issue you linked above should be fixed in the 5.0.0 release.
If log4j uses log truncation instead of log rotation there is always a risk of losing data as there is no guarantee that filebeat has already read the last lines.
What do you exactly mean by truncated event? Can you provide some examples'
We mean that we only get the end of the line for an event (on per rotation) which causes a _grokparsfailure in our forward processing (since there is only a partial line). Am I clearer ?
Partial line is one of first lines of the new file (first lines to the partial line get dropped).
When you say you "only get the end of the line", do you get not the complete event from filebeat or after the processing of logstash? And is the problem related to the multiline events?
I'm trying to understand where your problem is (LS or filebeat). If you just use file or console output, do you still see a problem?
Best would be if you could provide some "real" examples of your partial line.
Sorry to be unclear, problem comes from filebeat. The problem is not related to multiline events.
Here are the examples. Luckily, we have activated filebeat debug logs, so here are our parsed logs and the resulting logs in filebeats.
I'm currently trying to understand what DailyRollingFileAppender exactly does: https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/DailyRollingFileAppender.html Based on this page: DailyRollingFileAppender has been observed to exhibit synchronization issues and data loss. The log4j extras companion includes alternatives which should be considered for new deployments and which are discussed in the documentation for org.apache.log4j.rolling.RollingFileAppender. Based on this note I have the suspicion the problem could also be related to what DailyRollingFileAppender does. Any reason you use this one?
Back to the problem: As far as I understand the appender logs everything to the same file but copies over the last day to a separate file. The part I didn't find is when the same file is actually truncated or is it removed and a new one is created?
Well, why do we use DailyRollingFileAppender ? No idea, but we have ~250 apps using it, so that's some king of legacy I get your point on issues with DailyRollingFileAppender but in that case would the data be written in the file ? In our case, they are. The thing is we've been using logstash with the same king of configuration for about 3 years and we did not meet any problem of that kind.
We were suspecting something like this, so we've been watching inodes of logfiles. At the end of the day the file application.log is copied to application.log.TIMESTAMP (both files have the same timestamp) and a new application.log file is created.
That means application.log (old) and application.log (new) don't have the same inode? And also the file copied over to application.log.timestamp has a different inode? I think we are getting closer
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.