Hi,
In the Filebeat configuration file there is a scan_frequency option. I am confused about what it controls: does it define how often Filebeat scans the path for new files, or how often it scans a file for updates (new lines)?
If scan_frequency only checks for new files, what determines how often new lines in a file are picked up?
The setting actually does multiple things.
First and foremost, it is the setting that tells your beat how often to check your log directory (let's say it's /var/log/*.log) for new files to harvest; you are right about that.
The setting is also used to pick up closed files again.
If scan_frequency only checks for new files, what determines how often new lines in a file are picked up?
Filebeat will start a harvester for each log file you specified. The harvester picks up every new line added to the file. When the harvester does not detect a new line for a specified time (the close_inactive option; I think the default is 5 minutes), it closes the file due to inactivity.
Another thing the scan_frequency option does is determine when such a closed file is opened again: if the file has changed since it was closed, a new harvester is started for it on the next scan.
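To make the two timers concrete, here is a minimal sketch of how they could sit together in filebeat.yml. It assumes the classic log input type; other input types may name these options differently. The path is just the example from above, and the values shown are what I believe the defaults to be, so check them against the documentation for your version.

```yaml
# filebeat.yml (sketch, assuming the classic "log" input type)
filebeat.inputs:
  - type: log
    paths:
      - /var/log/*.log   # example path from this thread
    # How often the input scans the paths for new files, and for
    # closed files that have changed since their harvester stopped.
    scan_frequency: 10s
    # Close a file's harvester after this long without any new lines.
    close_inactive: 5m
```

With settings like these, a file that is being written to is read continuously by its harvester. scan_frequency only matters for files that are not currently open, either because they are new or because they were closed after being inactive.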
You can read more about it here:
General scan_frequency information:
Hello @madduck,
That helps, thank you very much. As I understand it, the log file remains open as long as new lines are being added, and the new lines are harvested immediately. Are the lines harvested one by one or as a group?
For Filebeat, every new line is its own log entry, at least for syslog-based logs. You can use the Logstash aggregate filter plugin to combine log lines based on certain criteria, for example the AUID.