I am aware of how to exclude a particular log type (line) using Filebeat; I have implemented it and it is working fine.
But now I am getting 20 lines of the same log type, and I want to exclude 19 of them at the Filebeat level and send only one line to Elasticsearch.
Below is a sample of the log lines:
Nov 2 09:46:44 xyz sshd[32511]: Accepted publickey for xyz
Nov 2 09:46:44 xyz sshd[32511]: Accepted publickey for xyz
Nov 2 09:46:44 xyz sshd[32511]: Accepted publickey for xyz
Nov 2 09:46:44 xyz sshd[32511]: Accepted publickey for xyz
Nov 2 09:46:44 xyz sshd[32511]: Accepted publickey for xyz
Nov 2 09:46:44 xyz sshd[32511]: Accepted publickey for xyz
Nov 2 09:46:44 xyz sshd[32511]: Accepted publickey for xyz
Nov 2 09:46:44 xyz sshd[32511]: Accepted publickey for xyz
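For context, pattern-based exclusion (which the question says is already working) is done with `exclude_lines` in the input configuration. A minimal sketch, assuming the log path — note this drops *every* matching line, not just the duplicates:

```yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/auth.log   # hypothetical path for this example
  # Drops ALL lines matching the pattern; Filebeat has no
  # built-in way to keep only the first of a run of duplicates.
  exclude_lines: ['Accepted publickey']
```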
Currently there is no way to remove duplicates in Filebeat, and in general each event is independent of the previous ones. Duplicate removal tends to be too dependent on the use case.
I guess it could be possible to implement something like multiline, but to detect consecutive duplicated log lines; a field with the number of repetitions could be added too. You can open an issue with this feature request.
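The logic being suggested (collapse runs of identical consecutive lines and record a repetition count) could be sketched like this — not Filebeat code, just an illustration of the proposed behavior:

```python
def collapse_consecutive_duplicates(lines):
    """Collapse runs of identical consecutive lines into
    (line, repeat_count) pairs, keeping one entry per run."""
    collapsed = []
    for line in lines:
        if collapsed and collapsed[-1][0] == line:
            # Same as the previous line: bump its repetition counter.
            collapsed[-1][1] += 1
        else:
            # New line content: start a new run with count 1.
            collapsed.append([line, 1])
    return collapsed
```

Applied to the 20 identical `sshd` lines above, this would emit a single entry with a count of 20.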
I have opened an issue on GitHub:
Duplication removal at filebeat level. #9033
I know it's not quite the answer you're looking for, but if you first send the events to a Logstash node, you could use the throttle filter, or even the drop filter with its percentage option.
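A minimal sketch of the throttle-filter approach, keyed on the message text — the period and tag name here are assumptions to adjust for your volume:

```
filter {
  throttle {
    before_count => -1          # disable the "before" threshold
    after_count  => 1           # allow 1 event per key per period
    period       => "60"        # assumed 60-second window
    key          => "%{message}"
    add_tag      => "throttled"
  }
  # Drop everything after the first matching event in the window.
  if "throttled" in [tags] {
    drop { }
  }
}
```

With this, the first `Accepted publickey` event in each window reaches Elasticsearch and the repeats are dropped.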