I have a URL pointing to a log file. I can write a cron job to download it (HTTP GET) periodically, but I'm wondering whether Filebeat will reset its cursor in the file after each download. Also, will it send duplicate lines to ELK, or do I need to remove duplicates on the ELK side?
Or is there another tool or way to do this? Thanks
If you can write a script that downloads the log file and only appends the new lines, you should be fine. Actually, overwriting the existing file from beginning to end should have the same effect, but by "overwrite" I don't mean `curl > file.log`, because that truncates the file first and that'll get you into trouble. The existing file needs to be opened and its contents overwritten in place.
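To make the append-only idea concrete, here is a minimal Python sketch. It assumes the remote log is itself append-only, so the bytes already on disk are a prefix of what the server returns. The names `fetch` and `append_new_bytes` are made up for illustration, and the URL and path are whatever you pass in.

```python
import os
import urllib.request


def fetch(url: str) -> bytes:
    """Download the full remote log (the URL is supplied by the caller)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def append_new_bytes(path: str, remote: bytes) -> int:
    """Append only the part of `remote` that lies beyond the local file's size.

    Assumption: the remote log only grows, so the first N bytes of `remote`
    match the N bytes already on disk. Returns the number of bytes appended.
    """
    local_size = os.path.getsize(path) if os.path.exists(path) else 0
    if len(remote) <= local_size:
        # Nothing new. A shorter remote file probably means it was rotated,
        # which this sketch does not handle.
        return 0
    # Mode "ab" never truncates: the existing bytes stay where they are
    # and only the new tail is added.
    with open(path, "ab") as f:
        f.write(remote[local_size:])
    return len(remote) - local_size
```

From cron you would call something like `append_new_bytes("/path/to/file.log", fetch("https://example.com/file.log"))`, with your own path and URL. Since the local file only ever grows, the offsets Filebeat has stored for it remain valid.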
Filebeat remembers the offset, so if you only append lines, no duplicates are sent. Filebeat follows the at-least-once principle, so duplicates can still be sent in case of network problems, but that is unrelated to how the file is read.