Hi, I am using Nginx as my app server, and I have installed Filebeat on the same host where Nginx runs.
Filebeat pushes the Nginx logs to Logstash, and Logstash then forwards them to Elasticsearch.
This setup used to work perfectly fine, but on one particular date (say Feb 13, for example's sake) I saw a sudden spike in the number of logs. Usually my application receives around 100K log records a day, but on that date there were around 10M.
When I checked, that day contained logs from every day between Dec 10 and Dec 28, so Feb 13 held all of the older December log records. Logstash had been running fine before Feb 13: Feb 10, 11, and 12 all had the expected number of records. I am not sure what happened on Feb 13 that triggered a sudden reprocessing of the older December logs.
I then checked Elasticsearch for the December log records. Those logs had already been processed back in December, but they were processed again on Feb 13.
Since then it has happened at least three more times, on random dates.
I am not sure how to debug this. Is Filebeat the culprit, or is it a problem with Logstash?
Where should I start debugging?
I would start with Filebeat; there are a number of configurations that could cause it to re-emit events (after all, sometimes that is a desired behaviour).
Do you have any evidence that Filebeat could have been restarted around those times?
Are the files being prospected on a network-attached volume, and if so, do you have any evidence that the volume could have been remounted around those times?
You may also be interested in setting an ignore_older directive, which tells Filebeat to skip any files that have not been modified within the given timespan.
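For illustration, a minimal sketch of what that could look like in filebeat.yml; the paths and durations are placeholders, not values from your setup, and depending on your Filebeat version the section is named filebeat.inputs or filebeat.prospectors:

```yaml
# filebeat.yml -- illustrative sketch only; paths and durations are assumptions.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/nginx/*.log
    # Skip files whose modification time is older than 48h.
    ignore_older: 48h
    # Drop registry state for files untouched for 72h; must be larger than
    # ignore_older + scan_frequency, or Filebeat refuses to start.
    clean_inactive: 72h
```

clean_inactive is what actually removes the old entries from the registry, while ignore_older only stops Filebeat from reading files it has no state for; they are usually set together.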
Yes, Filebeat is indeed restarting. We have dockerized Filebeat, and whenever a new deployment goes out it fetches the latest image and the container is restarted. The logs are stored on the instance running the container (via Docker volumes), so the old log files always remain. However, the registry files are lost when the container is restarted. So maybe that is the reason it reprocesses the older logs. Correct me if I am wrong.
Now, regarding the solution:
Is using the ignore_older directive the way to go? If so, what would be a suitable value for it? And if we set ignore_older, is it mandatory to also set clean_inactive? Please suggest the configuration best suited to my case (an Nginx server).
I was thinking about one more solution: what about exposing the /usr/share/filebeat/data directory, which contains the registry files, to the host instance running the container? That way, even if the Filebeat container restarts during a deployment, the registry files would persist on the host just like the logs do, so Filebeat would pick up from the correct offset and send only the relevant logs.
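Something roughly like this, as a sketch only; the image tag and host-side paths below are placeholders, not values from this thread:

```sh
# Bind-mount the registry directory (/usr/share/filebeat/data) so it survives
# container restarts; host paths and image tag are illustrative assumptions.
docker run -d --name filebeat \
  -v /var/log/nginx:/var/log/nginx:ro \
  -v /srv/filebeat/data:/usr/share/filebeat/data \
  -v /etc/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro \
  docker.elastic.co/beats/filebeat:6.2.4
```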
I am thinking the second solution would perhaps be better. Please suggest the best approach.
It may be best to ask a new question over in the Beats forum; I am not familiar enough with Beats to give recommendations about its use under Docker.
That said, as a mitigation within logstash-output-elasticsearch it is possible to specify the document's id instead of letting Elasticsearch auto-generate one; if that id is a checksum of the relevant fields, any duplicate processing will overwrite the previously written entries instead of creating duplicate documents.
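A minimal sketch of that idea using the fingerprint filter; the hosts, index pattern, source fields, and key are assumptions to adapt to your pipeline, not values from this thread:

```
filter {
  # Hash whatever uniquely identifies a log line (here just the raw message,
  # which is an assumption) into a metadata field.
  fingerprint {
    source => ["message"]
    target => "[@metadata][fingerprint]"
    method => "SHA256"
    key    => "nginx-dedup"   # arbitrary, but must stay constant across runs
  }
}

output {
  elasticsearch {
    hosts       => ["localhost:9200"]
    index       => "nginx-%{+YYYY.MM.dd}"
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

One caveat: overwriting only happens when the duplicate lands in the same index, so if you use daily indices, make sure the index date is derived from the event's own timestamp (via the date filter) rather than the time of processing.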