We have Filebeat set up to collect logs from AWS and forward them to Logstash.
Filebeat is installed on a Windows server running as an EC2 instance in AWS. The problem is that if we rebuild the EC2 instance, Filebeat has to start from the beginning, and it takes a while to catch back up. Where exactly is the Read Pointer that Filebeat uses to keep track of where it last read in the AWS logs?
Is there any documentation on the Filebeat Read Pointer, and on how to move it to a new Windows instance?
That is the registry; by default it is in the Filebeat data directory.
I'm not sure where this is on Windows because I do not use Windows, but you can change it with the setting filebeat.registry.path: desired-path in filebeat.yml, as mentioned in the documentation.
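A minimal sketch of that setting, assuming a hypothetical location on a volume that survives the instance rebuild (the D: drive and folder name are placeholders, not defaults):

```yaml
# filebeat.yml (sketch only)
# Hypothetical path: point the registry at storage that is kept when the
# EC2 instance is rebuilt, for example a separately attached volume.
# Forward slashes avoid backslash-escaping problems in YAML on Windows.
filebeat.registry.path: "D:/filebeat/registry"
```

Filebeat would need to be stopped before the existing registry is copied to the new location, and restarted afterwards.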
But I'm not sure this would solve your issue. Where are the files you are reading stored? Filebeat tracks the lines it has read by inode, so if you are recreating the machine, all the inodes would change as well.
I'm reading from AWS S3 buckets, and in some cases there are about 3 years' worth of data. Filebeat will go through it and get caught up, but if I have to start a new instance, it has to read through 3 years' worth of data all over again.