I need to exclude log files that were created more than a day ago. In simple words, I need Filebeat to include only today's log file. In my path, I have log files named like user_log.2019.06.29, and that folder holds one month of data.
So, please suggest how to pick up only today's log file. Can you also help me understand the Filebeat setting called ignore_older, and whether it can help me achieve this?
@Juanma, I understand the concept. But suppose I start Filebeat today at around 17:00 with ignore_older set to 24h. At that point my index name is user_log-2019.07.01, i.e., today's date. So Filebeat will take this log file and create the index, and it will keep picking up changes until 17:00 tomorrow. Is my understanding correct?
After 23:59 on 01.07.2019 the date changes and a new log file, user_log-2019.07.02, will be created. I want Filebeat to take this file as input. So what value should I set for ignore_older?
The ignore_older setting is independent of new files.
If you have used a wildcard in your paths setting, Filebeat will pick up new files without any issue and will stop watching each file once it has gone unmodified for the configured number of hours.
For example, imagine that today is 2019.01.01 and you want to ignore the files from 2018, for logs that are named log-*.log and live under /var/log/example.
If we want to pick up newer files automatically, we need to set the paths parameter as follows:
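A minimal sketch of such a configuration, assuming the `log` input type and the `filebeat.inputs` syntax of recent Filebeat versions; the path pattern and the 17h value are the ones used in this reply, everything else is an assumption:

```yaml
filebeat.inputs:
  - type: log              # assumption: the classic log input
    paths:
      # The wildcard lets Filebeat discover each new daily file by itself
      - /var/log/example/log-*.log
    # Skip files whose last modification is older than 17 hours
    ignore_older: 17h
```

Note that ignore_older is evaluated against each file's last modification time, not against the date in its name, and it has to be larger than close_inactive (5m by default).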
With this configuration, Filebeat will send to the previously configured output any file that matches the pattern "/var/log/example/log-*.log" and that has been modified within the last 17 hours.
If you want, you can change it to 24h a few days later, once there is no longer any chance of picking up data you don't want.
I would also try removing the parameter in a test environment, to check whether Filebeat reads the older files once the sincedb (the registry, in Filebeat's terms) has been created.
I still have a doubt. So with the 17h setting, Filebeat will not take a file that was last modified more than 17 hours ago.
But as I described in my requirement, today's log file, user_log-2019.07.01, is updated until 23:59:59 on 01.07.2019. At 00:00:00 on 02.07.2019, a new log file named user_log-2019.07.02 will be created. So on 02.07 I want Filebeat to take this new log for 02.07.2019, and it should not take any of the previous day's logs.
So, what needs to be set for ignore_older to meet this condition?
At 00:00 on 02.07.2019 Filebeat will start with the 02.07.2019 file. If the 01.07.2019 file gets no more lines, Filebeat will stop watching it for new lines after 17h, while it keeps ingesting the 02.07.2019 file. During those 17h Filebeat won't take any line from the 01.07.2019 file unless a new line is written to it, which should not happen, because no process should write to yesterday's log.
Yes, now it's clear. So at 00:00 on 02.07.2019, Filebeat will take the 02.07.2019 file, and since there will be no new entries in the 01.07.2019 log file, Filebeat will stop considering the 01.07.2019 file (but only after 17h).
But I don't want any entries from the 01.07.2019 log file after 00:00 on 02.07.2019. That means from 00:00 on 02.07.2019, the Filebeat index must contain only 02.07.2019 entries. No entries from 01.07.2019 are allowed. So, how can this be achieved?