Filebeat version is 1.2.3.
The Filebeat server runs Windows 2008 R2.
Filebeat is sending logs to Logstash on a Red Hat Enterprise Linux server.
An Elasticsearch instance is installed on the same server.
Elasticsearch and Logstash are both version 2.1.1.
The maximum throughput was 1255 events per second.
A lot of processing work is being done by the Logstash instance.
At the index level, we are also using nGram analysis.
The Filebeat config file holds 22 different prospectors.
These are different log files on the same machine, which we distinguish by log file type.
Some are multiline, but that is handled on the Logstash side, not yet on the Filebeat side.
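For context, a setup like the one described above might look roughly like the following in Filebeat 1.x config syntax. This is a hedged sketch, not the poster's actual file: the paths, type names, and Logstash host are hypothetical, and only two of the 22 prospectors are shown.

```yaml
filebeat:
  prospectors:
    # One prospector per folder, distinguished by document_type
    - paths:
        - C:\logs\app\app.log
      document_type: app_log
      close_older: 5m        # release file handles 5m after last read
    - paths:
        - C:\logs\microsite\micrositehits.log
      document_type: micrositehits
      close_older: 5m
output:
  logstash:
    hosts: ["logstash-host:5044"]   # hypothetical host
```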
My first idea was that multiline could cause the high CPU, but as you wrote, that happens on the LS side. Can you share your Filebeat config file so we can check for other potential issues? What is your scan_frequency?
I finally also responded on the other thread. I'm curious whether the high CPU usage could be related to the number of files you are harvesting, and whether you are potentially hitting a limit there. How many files are you harvesting at the same time right now? Did you have the same issues with 5.0?
I am currently only at the DEV stage with version 5, since I would like to replace Logstash with the Ingest node.
Each prospector is defined to work against a certain folder with one active log file.
Of course there can be rotation, and within a peak hour there can be a few files still active for the same prospector.
However, the config sets the close_older parameter to 5m.
What is the best way to check it?
I think the mechanism that tracks the last modified date of files listed in the registry and still present in the folder also has something to do with it.
Do you see the issue also on your dev machines with the Filebeat 5.0 builds?
What is the best way to check it? To check what? How many files are harvested? How many states there are? The best way is the log files. If you enable the debug level, all the information and details should be there. Logging has also been improved for 5.0, and already at the INFO level the necessary info should be there: every 30s, metrics are printed out.
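Enabling debug logging as suggested above can be done in the Filebeat config. A minimal sketch (the log path is hypothetical; selectors can be dropped to log everything):

```yaml
logging:
  level: debug
  # Limit debug output to the components relevant here
  selectors: ["harvester", "prospector", "registrar"]
  to_files: true
  files:
    path: C:/ProgramData/filebeat/logs   # hypothetical path
    name: filebeat.log
```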
Not 100% sure I understand this. Are you referring to how often the registry file is written? What is the size of your registry file?
One more thing: with Filebeat 5 alpha 4, Filebeat is not releasing its lock on the file after close_inactive is reached.
I tried to delete the file but could not. I also tested with close_removed and still could not delete it; the lock was still there, and so was the file.
Could it be that the file is somehow still being harvested? Did you see a message somewhere along the lines of "couldn't remove state because not finished"? close_removed only applies after the file is removed, so you should be able to remove the file, and then the handler is removed.
The close_inactive parameter replaces close_older, am I right?
I waited for the close_inactive time and then deleted the file, but it was still in the folder.
After stopping Filebeat, the file disappeared from the folder.
No extra data was written to the file in the meantime, so it was inactive.
Yes, close_inactive replaces close_older. close_inactive is the time since the harvester last read a line (which is not necessarily the file's modification time). Can you check the logs for messages indicating whether the file was closed? Debug mode is probably needed for that.
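In the 5.0 config syntax, the renamed options discussed here would be set per prospector. A minimal sketch, assuming a single prospector over the input folder mentioned later in the thread (the glob pattern is an assumption):

```yaml
filebeat.prospectors:
  - input_type: log
    paths:
      - C:\Program Files\filebeat\input\*.log
    close_inactive: 5m     # was close_older in 1.x: close the handle 5m after the last read line
    close_removed: true    # release the handle once the file is removed
```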
2016-09-08T15:55:17+03:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=-1 publish.events=1 registrar.state_updates=1 filebeat.harvester.closed=1 filebeat.harvester.running=-1
It looks ok now.
In a second test, found:
2016-09-08T16:46:01+03:00 INFO Read line error: file inactive
2016-09-08T16:46:01+03:00 DBG Stopping harvester for file: C:\Program Files\filebeat\input\micrositehits.log
2016-09-08T16:46:01+03:00 DBG Stopping harvester, closing file: C:\Program Files\filebeat\input\micrositehits.log
2016-09-08T16:46:01+03:00 DBG Update state: C:\Program Files\filebeat\input\micrositehits.log, offset: 309242
2016-09-08T16:46:06+03:00 DBG Flushing spooler because of timeout. Events flushed: 1
One thing that started worrying me while reading your log lines is this:
2016-09-08T15:39:25+03:00 DBG delete old: remove c:/Program Files/Filebeat/registry/filebeat_micrositehits.registry.old: The system cannot find the file specified.
It seems like the registry file can't be rewritten properly. Could that potentially be because of manual edits?
Do you remove it before starting, and avoid editing it while Filebeat is running? If so, it should not have any effect. As the error does not always appear, I assume permissions are as expected? On shutdown, does the registry file have the correct content?