Hi! I can see "duplicated events" is a very common issue, but I can't find a solution for my problem, which looks like a fairly generic one IMHO.
I am using Filebeat in a Kubernetes cluster (handling both Kubernetes autodiscovery and file log extraction), so I have 8 instances created by the DaemonSet, one per node.
As a result, the file log extraction is replicated 8 times, once per node, and each event is sent 8 times. Is there an easy way to solve this?
If not, I can foresee 3 solutions:
1. Deploy a single instance that takes care of everything (if k8s autodiscovery can cover the whole cluster from one node).
2. Keep these 8 instances + 1 dedicated instance for file log extraction.
3. Stop using the add_id processor and use the fingerprint processor instead, but it would be a waste of processing to read the file 8 times and drop 7 of the copies (see the sketch after this list).
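For option 3, I imagine something like the following in the DaemonSet config, so identical events get the same document _id and Elasticsearch keeps only one copy; the fields I fingerprint here are just a guess:

```yaml
processors:
  # Replace add_id (random _id per event) with a content-based fingerprint,
  # so the same log line read on 8 nodes always maps to the same _id.
  - fingerprint:
      fields: ["message", "log.file.path"]   # assumed fields; adjust to whatever uniquely identifies a line
      target_field: "@metadata._id"
      method: "sha256"
```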
What do you mean by file log extraction? Could you provide your configuration? Usually each DaemonSet pod autodiscovers the pods/containers on the node where it runs. I don't see any way to discover containers from a different node.
@ChrsMark sorry for not being specific enough. I have 2 inputs: k8s autodiscover (which works perfectly) and a file log input (which sends each event 8 times). Filebeat is deployed with the standard DaemonSet (so one instance per node), and the file logs are read once per node.
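To illustrate, the DaemonSet config is roughly like this (paths and hints settings are simplified placeholders); the static input at the bottom is what every one of the 8 pods evaluates:

```yaml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      hints.enabled: true          # per-node container logs: works fine

filebeat.inputs:
  # Static input, evaluated by every DaemonSet pod; since the files are
  # visible from all nodes, each of the 8 pods reads them and every event
  # is shipped 8 times.
  - type: log
    paths:
      - /mnt/shared/app/*.log      # placeholder path
```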
I was waiting for an answer to this, but I believe the best solution, if I cannot do autodiscovery across nodes, is the following:
"keep these 8 instances for k8s autodiscover + 1 new dedicated Filebeat instance for file log extraction"
So the logs from which you see the duplicated events are being collected from the host? Then yes, it seems you only need one Filebeat instance to handle this cluster-wide input.
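Roughly, a split like this could work (paths and settings are just placeholders): keep the DaemonSet for per-node container logs and move the file input to a single-replica deployment.

```yaml
# filebeat.yml for the DaemonSet (one pod per node): autodiscover only
filebeat.autodiscover:
  providers:
    - type: kubernetes
      hints.enabled: true

# filebeat.yml for a separate single-replica Deployment: the shared file
# input only, so each file is read exactly once for the whole cluster
filebeat.inputs:
  - type: log
    paths:
      - /mnt/shared/app/*.log      # placeholder path
```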