Duplicate Docs Getting Created

We are running Fluentd in our Kubernetes cluster as a DaemonSet, and logs are shipped directly to Elasticsearch. We are seeing duplicate logs stored in ES, i.e. for a single log line we see multiple docs (the same line appears five times below). What could be the reason for the duplication?

Mar 7, 2024 @ 13:59:53.110	Received request with SearchGuid - ea1c7da0-ebf1-43e6-acf5-15d6b5ed2008, ContractId - 6, TransactionId - None at 2024-03-07 13:59:53.110201-05:00	Mar 7, 2024 @ 08:59:53.110

Mar 7, 2024 @ 13:59:53.110	Received request with SearchGuid - ea1c7da0-ebf1-43e6-acf5-15d6b5ed2008, ContractId - 6, TransactionId - None at 2024-03-07 13:59:53.110201-05:00	Mar 7, 2024 @ 08:59:53.110

Mar 7, 2024 @ 13:59:53.110	Received request with SearchGuid - ea1c7da0-ebf1-43e6-acf5-15d6b5ed2008, ContractId - 6, TransactionId - None at 2024-03-07 13:59:53.110201-05:00	Mar 7, 2024 @ 08:59:53.110

Mar 7, 2024 @ 13:59:53.110	Received request with SearchGuid - ea1c7da0-ebf1-43e6-acf5-15d6b5ed2008, ContractId - 6, TransactionId - None at 2024-03-07 13:59:53.110201-05:00	Mar 7, 2024 @ 08:59:53.110

Mar 7, 2024 @ 13:59:53.110	Received request with SearchGuid - ea1c7da0-ebf1-43e6-acf5-15d6b5ed2008, ContractId - 6, TransactionId - None at 2024-03-07 13:59:53.110201-05:00	Mar 7, 2024 @ 08:59:53.110

You'll probably need to debug within Fluentd.

Elasticsearch will store duplicate docs if it receives the same doc multiple times without an explicit id, since each request then gets a new auto-generated id (exactly-once delivery isn't a thing). We rely on the client to generate unique ids if duplication is a risk/problem.
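
To illustrate the client-generated id idea, here is a minimal Python sketch, not Fluentd configuration. It assumes the id can be derived by hashing the record's content; the helper names (`make_doc_id`, `index_with_id`) and the field names in the sample record are made up for illustration, and a plain dict stands in for the index. The point is only that a resent doc with the same id overwrites the earlier copy instead of adding a new one.

```python
import hashlib
import json


def make_doc_id(record: dict) -> str:
    """Derive a deterministic id from the record's content (hypothetical helper)."""
    # Sort keys so the same record always serializes to the same bytes.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def index_with_id(store: dict, record: dict) -> str:
    """Stand-in for indexing with an explicit id: same id overwrites."""
    doc_id = make_doc_id(record)
    store[doc_id] = record
    return doc_id


# Sample record based on the log line in the question; field names are assumed.
record = {
    "timestamp": "2024-03-07 13:59:53.110201-05:00",
    "message": (
        "Received request with SearchGuid - "
        "ea1c7da0-ebf1-43e6-acf5-15d6b5ed2008, ContractId - 6, "
        "TransactionId - None"
    ),
}

store = {}
for _ in range(5):  # simulate the same line being delivered five times
    index_with_id(store, record)

print(len(store))  # one stored doc, not five
```

One trade-off with hashing content: two genuinely separate log lines that are byte-for-byte identical (same timestamp and message) would also collapse into one doc, so the hashed fields need to be specific enough for your logs.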
