I am using the S3 input plugin in Logstash to read a streaming log file. This log file, being an access log, has a retention policy of 60 days on S3.
When I restart Logstash for changes or maintenance, the input plugin starts re-consuming all the previously read lines of the file, which is huge and unwanted.
Is there a way to maintain the state of the last read line in the input plugin settings, similar to how Filebeat keeps the state of each file and remembers the last read line?
What do you mean by "streaming log file"? Unless something has changed, S3 objects are immutable and cannot be appended to.
Because S3 objects are immutable, the S3 Input Plugin's unit of work is a single file. If interrupted in the middle of a file, it will not mark that file as complete, and the next time it starts, it may re-emit some events.
It's also possible that you have the S3 input configured not to save a record of where it left off. Do you have a sincedb_path directive that tells Logstash to discard that record (such as pointing it at the null device, /dev/null)?
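For reference, here is a minimal sketch of an s3 input with a persistent sincedb. The bucket name, region, prefix, and file path are all placeholders; adjust them to your setup:

    input {
      s3 {
        bucket       => "my-access-logs"     # placeholder bucket name
        region       => "us-east-1"          # placeholder region
        prefix       => "access/"            # placeholder key prefix
        # Persist progress across restarts. Pointing this at "/dev/null"
        # disables tracking and forces a full re-read on every restart.
        sincedb_path => "/var/lib/logstash/s3_access_sincedb"
      }
    }

Note that the S3 input's sincedb records the last-modified timestamp of the most recently processed object, not a per-line offset, so it tracks progress at the whole-object level rather than the way Filebeat tracks byte offsets within files.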