I am testing Filebeat 5.0 Alpha 5.
I need to read 22 different logs from one server. When Filebeat runs on the same machine, it constantly consumes a lot of CPU (50%-60%).
The application server is installed on Windows.
I cannot add more CPU resources to the application servers (WebLogic) due to licensing costs.
We have 5 application servers.
So I have set up a separate RHEL server with Filebeat installed and running on it. Filebeat accesses the application servers over CIFS and reads the log data.
However, I am seeing duplicate events in Elasticsearch. It seems that Filebeat reads the files again, even though the registry is updated with each file's offset and state.
This is the configuration I need to use: accessing the log files through some kind of network share. So I need Filebeat to be able to read the data without re-reading and duplicating it.
Is it possible that the inode of the file or the device ID changes? I have seen such cases on Windows shares in the past. Since Filebeat identifies files by inode and device ID, that would explain why some files appear more than once. Can you check whether the registry lists some files multiple times with different inodes / devices?
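The registry check suggested above could be scripted. This is only a minimal sketch: it assumes the registry is JSON, either a list of entries or a map keyed by path, and that each entry has a `source` field and a `FileStateOS` object with `inode` and `device`. Those field names are an assumption and may differ between Filebeat versions. The function names are hypothetical.

```python
import json
from collections import defaultdict


def load_states(registry_text):
    """Parse registry JSON (assumed layout: a list of states, or a dict keyed by path)."""
    data = json.loads(registry_text)
    if isinstance(data, dict):
        return list(data.values())
    return list(data)


def find_duplicate_sources(states):
    """Return {source: {(inode, device), ...}} for sources listed with more than one identity."""
    seen = defaultdict(set)
    for state in states:
        # "FileStateOS", "inode" and "device" are assumed field names.
        os_state = state.get("FileStateOS", {})
        seen[state.get("source")].add((os_state.get("inode"), os_state.get("device")))
    return {src: ids for src, ids in seen.items() if len(ids) > 1}


# Usage (the path is a placeholder):
#   with open("/path/to/registry") as f:
#       print(find_duplicate_sources(load_states(f.read())))
```

An empty result would mean every file appears with a single inode/device pair, which points away from changing file identities as the cause.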
I would still hope we can get the resource usage under control on the Windows machine. Does it only peak at 50-60% or is it constantly at this level? Can you link the other forum post with the previous discussions for others to follow?
In the registry file, there is only one entry for that file.
I can see the registry file being updated with the offset of the myapp.log file. Most of the time the offset increases, for example:
1004589
1005721
1006788
etc..
But there are cases in which the offset starts from 0 again, and the whole file is read all over again and re-sent to ES.
This happens repeatedly.
I was following the registry file simply by printing its content.
Remember that there was only one log file.
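The manual check described above (watching the offset and spotting when it drops back) could be automated. A minimal sketch, assuming the offsets were collected in order by repeatedly reading the registry; the function name is hypothetical:

```python
def find_offset_resets(offsets):
    """Return the indices of samples whose offset is lower than the previous sample."""
    return [i for i in range(1, len(offsets)) if offsets[i] < offsets[i - 1]]


# Example with the offsets quoted above, followed by a drop back to 0.
samples = [1004589, 1005721, 1006788, 0, 512]
print(find_offset_resets(samples))  # prints [3]
```

Any index reported here marks a point where the file was read again from an earlier position.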
As a test:
I created a small script on Windows with a counter that prints lines indefinitely, like this:
Current counter: 1
Current counter: 2
Current counter: 3
.....
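For anyone who wants to reproduce the test, a counter script along these lines would produce the same kind of file. This is a sketch, not the script actually used: the function name, the one-second delay and the append mode are assumptions.

```python
import time


def write_counter_lines(path, count=None, delay=1.0):
    """Append 'Current counter: N' lines to path; runs forever when count is None."""
    n = 0
    with open(path, "a", encoding="utf-8") as log:
        while count is None or n < count:
            n += 1
            log.write(f"Current counter: {n}\n")
            log.flush()  # make each line visible to readers right away
            time.sleep(delay)
    return n


# Usage (the file name is a placeholder):
#   write_counter_lines("counter.log")
```

Passing a `count` makes the run finite, which is convenient when comparing the number of lines written with the number of events received.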
Filebeat on Linux then reads that file and writes the lines to a local output file on the Linux machine.
Checking the registry file, I saw exactly the same behaviour, and the output file contained duplicated lines.
Within the registry file, the INODE and DEVICE_ID were the same for the whole process.
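The duplicated lines in the output file can be counted with a small script. A minimal sketch, assuming each output line contains the text `Current counter: N` somewhere (for example inside a JSON event); the function name is hypothetical:

```python
import re
from collections import Counter

COUNTER_RE = re.compile(r"Current counter: (\d+)")


def find_duplicate_counters(lines):
    """Return {counter value: occurrences} for values that appear more than once."""
    counts = Counter()
    for line in lines:
        match = COUNTER_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return {value: seen for value, seen in counts.items() if seen > 1}


# Usage (the path is a placeholder):
#   with open("/path/to/filebeat-output") as f:
#       print(find_duplicate_counters(f))
```

If the whole file is re-read from offset 0, every counter value up to the point of the re-read should show up here with a count of 2 or more.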
I have started testing Windows to Windows using shares. So far it seems to be OK and is not duplicating.
I am using the same counter test.
Should it be OK to work with shares from Windows to Windows, instead of CIFS between Linux and Windows?
Regarding the CPU, it is constantly that high.
There are 22 Prospectors defined in the YML file.