My understanding is that filebeat looks at the modification timestamp provided by the OS to determine if the file has been modified, and then the harvester tries to read from where it left off, correct?
Wouldn't it be a better strategy to compare the offset from the last read against the current file size to see if the file has grown, instead of relying only on the OS mod time? This could be an option like:
modification_detection_strategy => [ "filesize", "modtime" ]
which would indicate trying both strategies in the order listed.
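Purely to illustrate the idea (this option does not exist today; the name and values are made up), in filebeat.yml terms it might look something like:

filebeat:
  prospectors:
    -
      paths:
        - "/var/log/myapp/app.log"
      input_type: log
      # hypothetical option, not part of filebeat; try each strategy in order
      modification_detection_strategy: ["filesize", "modtime"]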
Since I am using filebeat to monitor log files that simply grow until they roll over, a file size greater than the file size at the last harvest should signify new data to harvest.
I don't know JRuby, so if someone can create a patch and explain how to install it using simple OS commands - that would be very helpful.
Interesting, I wasn't aware of this bug. @steffens, you might also be interested in this one.
If a harvester is already reading a file, the ModTime is not used, only the offset. The ModTime is only used to decide whether a file should be picked up again. Currently the Prospector does not "open" the file to check the offset, so this would mean a change in logic.
Filebeat is actually written in Golang and not JRuby.
A similar issue was seen in Splunk as well, so filebeat may well have it too. I know Splunk resolved the issue by using the timestamp of the last log read instead of the modtime.
Any suggestions on how to resolve this? We are losing events in our setup.
What role do ignore_older and close_older play here? In our case it is set to 10m. And I see that even though the log file is being written to, the mod date is the same as yesterday, when the file rolled over.
Thanks for the above link, very interesting. When you mention a timestamp, which timestamp do you mean? Based on the thread above, the only current solution would be for the application that writes the log to close it from time to time.
The config entries for Windows are identical. Just use
The timestamp of the last log read from the log file by Splunk. So even if the modtime is not being updated, but the last log was read a second ago, Splunk considers the file to be actively written to.
The docs for ignore_older and close_older can be found here: https://www.elastic.co/guide/en/beats/filebeat/1.2/configuration-filebeat-options.html#ignore-older These options are mainly about closing the file handler on the filebeat side, which is unrelated to whether your app closes its file handler.
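For reference, both are prospector-level settings in filebeat 1.x; a minimal sketch with the 10m values mentioned above (the path is a placeholder):

filebeat:
  prospectors:
    -
      paths:
        - "/var/log/myapp/*.log"
      input_type: log
      # ignore files whose last modification is older than 10 minutes
      ignore_older: 10m
      # close the file handler after the file has been inactive for 10 minutes
      close_older: 10m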
How do we check if our app does that? We have a web service running on an Apache Tomcat server.
We implemented a script to touch the file every 5 minutes:
touch -m filename
I will run in debug mode to compare the transactions sent in the debug logs against the actual files.
What role do the logstash workers play here?
We have logstash workers set to 3; will that send the logs 3 times? We are also load balancing across 4 logstash nodes; one of them is down, which should not be an issue. I am currently seeing duplicate events in Kibana and wondering if that is because of the logstash workers setting in filebeat.yml?
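For context, the output section in question looks roughly like this in filebeat 1.x terms (hostnames and port are placeholders, and I assume the setting meant here is the logstash output's worker option):

output:
  logstash:
    # four Logstash nodes, load balanced
    hosts: ["ls1:5044", "ls2:5044", "ls3:5044", "ls4:5044"]
    loadbalance: true
    # number of workers publishing events to each configured host
    worker: 3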
I do not know coding, but if you could store the timestamp of the last log read in a variable and check against that every time, instead of checking the file timestamp, wouldn't that make more sense?
@logstash_oz Sorry for the late reply. For the timestamp: with this PR (https://github.com/elastic/beats/pull/1703) I introduced a field last_read which stores the timestamp when the file was last read (not last modified). Currently this is not taken into account when starting a harvester, but it definitely gives us the opportunity to do so. I'm thinking of doing some comparison of last_read and modTime in the future.
About your LS questions: if you have 3 workers, the events should still be sent only once. Filebeat follows an at-least-once principle, so in some cases a line can be sent more than once. Is it just an edge case that you see events multiple times (for example when a node goes down), or does that happen very often?
I decided to go with the strategy of using close_older: 167h
Since the file 'rolls over' every 7 days, I figured this would keep the harvester going long enough. The only trick is to make sure the harvester is not locking the file when the application rolls the log over and tries to write to a new version of it (on Windows), and that the rollover is detected correctly. The main file is foo.log and rolls over to something like foo.2016-05-27.log.
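For reference, a minimal sketch of the prospector section I have in mind (the path is a placeholder):

filebeat:
  prospectors:
    -
      paths:
        - "C:/logs/foo.log"
      input_type: log
      # keep the harvester open just under the 7-day rollover period
      close_older: 167h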
I still think that, given this problem with the OS, it would be a better design to check the file size and the last-read value, or to have them available as selectable policies.
@logstash_oz @wmcdonald The newest version of filebeat (5.0.0-alpha3) now relies much less on the modification date for harvesting. Could you try out whether this version resolves the above issues?