Hi, I have a fairly simple Logstash setup:
- Logstash runs on a web server and sends events to Redis.
- Logstash running on another server takes the events from Redis and outputs them to ES.
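A minimal sketch of this kind of two-stage setup, to make the topology concrete. The file input, the Redis list data type, and every hostname, path, and key name here are placeholder assumptions, not the actual configuration:

```
# Shipper, on the web server (path, host, and key are hypothetical)
input {
  file {
    path => "/var/log/myapp/app.log"
  }
}
output {
  redis {
    host => ["redis.example.com"]
    data_type => "list"
    key => "logstash"
  }
}

# Indexer, on the other server (hosts and key are hypothetical)
input {
  redis {
    host => "redis.example.com"
    data_type => "list"
    key => "logstash"
  }
}
output {
  elasticsearch {
    hosts => ["es.example.com:9200"]
  }
}
```

With no document_id set on the elasticsearch output, ES generates a new _id for every event it receives, so an event delivered twice ends up as two separate documents.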
Not constantly, but fairly often, I see duplicated events showing up in ES. The web server's log file contains the event only once, but it appears in ES twice, and each ES document has a different _id. I know the events are duplicates because many of them carry a unique ID in an application-specific field. Those IDs are not suitable for use as ES _ids, so I don't use them as such.
The duplicates do not seem to be associated with restarts of Logstash, Redis, or ES. I do occasionally see one server or another having trouble connecting to my Redis instance, but those problems don't always coincide with the duplicates.
I am running Logstash 2.0 and ES 2.0. I plan to upgrade, but I don't expect that to help with this particular problem.
Thanks,
John Ouellette