My issue is that Filebeat keeps running after it has reached the end of the input, waiting for new lines on stdin, I suppose. Is there a way to tell it to stop?
I want to write some scripts to perform the backfilling. This behavior makes it hard to write them. Maybe I should backfill logs another way.
Instead of changing Filebeat to shut down when the write end of a stdin pipe is closed, perhaps reading gzipped files should be supported out of the box. Then you wouldn't have to use zcat, and the processing of a file would be resumable.
Yes, it can indeed be difficult to write backfilling scripts, since Filebeat doesn't stop. I think it would make more sense to add another input type to Filebeat for backfilling gzipped log files. Could you please open a feature request on GitHub?
I'm going to assume this enhancement won't be done for a while, though. I should probably come up with another workaround in the meantime, because I need to preserve the names of the files, which seems to be a problem.
Your zcat idea works great though. I'll definitely be using something similar.
EDIT: Do you know if there's a way to make plugins or something similar for Filebeat?
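For the interim workaround discussed above (backfilling gzipped logs while keeping the original file names), one option is to decompress the archives into a staging folder and point a normal file-based Filebeat input at that folder instead of piping through stdin. The following is only a rough sketch in Python, not a Filebeat feature: the function name `gunzip_tree` and both directory arguments are made up for illustration.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_tree(src_dir, dst_dir):
    """Decompress every *.gz under src_dir into dst_dir, keeping relative names.

    Hypothetical helper: "web/access.log.gz" becomes "<dst_dir>/web/access.log".
    Returns the list of files written during this call.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    written = []
    for gz_path in sorted(src.rglob("*.gz")):
        # Keep the relative path and drop only the trailing ".gz".
        target = dst / gz_path.relative_to(src).with_suffix("")
        if target.exists():
            continue  # staged by an earlier run, so the script can be re-run
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so a half-written file never
        # appears under its final name.
        tmp = target.with_name(target.name + ".part")
        with gzip.open(gz_path, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        tmp.rename(target)
        written.append(target)
    return written
```

If the Filebeat path pattern for the staging folder is something like `*.log`, the temporary `.part` files are not matched. The decompressed copies cost disk space, so this only makes sense for a one-off backfill.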
Thanks for making this post, it was really helpful for related and unrelated learning.
I'm doing some experimental work to educate myself about this stack. I scp'ed some server log folders down to my machine and have an ELK Vagrant box. Filebeat is installed on the host machine and is forwarding the logs to the Vagrant box.
I am trying to pump in the exact same files again and again after deleting the Logstash indices in Elasticsearch. I was pretty sure it worked the first few times I tried it, but I don't know for sure. I simply deleted the registry file and restarted the Filebeat service.
Should this work? Should deleting the registry file allow me to backfill the logs that are already sent? Must I really cat/zcat and pipe to the filebeat binary?
Absolutely. I have tried this for hours. It did work, then it stopped.
The registry file is written to when the service is shut down, so I wait until that happens before I remove it. However, somehow, Filebeat still knows what it has already forwarded when I start it again.
I have tried stopping the service and simply running the binary in order to simplify the experiment. The effect is the same.
Is there definitely no other way for filebeat to know what is sent other than the registry file?
EDIT: A caveman workaround of course is to simply copy all the logs into another folder, update the filebeat config and restart the service. However, the registry being deleted hasn't helped me with this issue.
EDIT2: I ended up reinstalling Filebeat and am now getting the expected behaviour again, i.e. I delete the registry and push all the logs in again. It doesn't make any sense, so I guess I must have messed something up while debugging.
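The "caveman workaround" from the first EDIT (copy the logs into another folder, then point Filebeat at it) can be scripted. This is a minimal sketch in Python; the function name and the `run_label` argument are hypothetical, and updating the Filebeat config and restarting the service still has to be done separately.

```python
import shutil
from pathlib import Path


def copy_logs_for_replay(src_dir, dst_root, run_label):
    """Copy a log folder to <dst_root>/<run_label> so the files get new paths.

    Hypothetical helper. Refuses to overwrite an existing run folder.
    Returns the sorted list of copied files.
    """
    target = Path(dst_root) / run_label
    if target.exists():
        raise FileExistsError(f"replay folder already exists: {target}")
    # shutil.copy (unlike the default shutil.copy2) does not preserve the
    # modification time, so the copies look freshly written, similar to
    # `cp` without `-p`.
    shutil.copytree(src_dir, target, copy_function=shutil.copy)
    return sorted(p for p in target.rglob("*") if p.is_file())
```

Using a fresh `run_label` for each experiment (for example `run1`, `run2`) gives Filebeat paths it has not seen before, which is the effect the manual copy relied on.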
Filebeat has an option, ignore_older, which is set to 24h by default. If a file is older than that, Filebeat will not process it. You can try the nightlies, which set ignore_older to infinite by default and introduce close_older for closing unchanged files.
You can run filebeat with debug output any time: -v -d '*'
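To check whether the ignore_older cutoff mentioned above could explain files being skipped, it can help to list the files whose modification time is older than the cutoff. The sketch below is plain Python and does not call Filebeat; it assumes the age check is based on the file's last modification time, and the function name and parameters are illustrative only.

```python
import time
from pathlib import Path


def files_older_than(log_dir, max_age_hours=24.0, now=None):
    """Return files under log_dir last modified more than max_age_hours ago.

    Hypothetical helper: `now` (seconds since the epoch) can be passed in
    to make the check repeatable; it defaults to the current time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    return sorted(
        p for p in Path(log_dir).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

Any file this returns for a 24-hour window is a candidate for being skipped by a 24h ignore_older setting, so raising that setting or refreshing the files' modification time would be the next thing to try.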
Absolutely. Thanks for the reply. I saw the ignore_older setting but managed to get it all going after copying the logs to a different location and removing the registry. Also, weirdly, I had to reinstall Filebeat.
I was tailing the logs on both ends, filebeat and logstash, and watching the registry folder with 'watch ls'.
This could mean a number of things, so this post is probably of little value to the forum. If it happens again I will try to be more scientific in my approach.