I'm trying to set things up so that I can replay logs with zero risk of duplication. It seems to me that a good way to do that would be to make a field, possibly a UUID or _id, a consistent hash of the input. The data is a slightly modified Apache log.
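To make the idea concrete, here is roughly what I mean, as a small Python sketch rather than anything Logstash-specific. The function name `stable_doc_id`, the `source` argument (some string identifying where the line came from, such as host plus file path) and the choice of SHA-1 are all my own assumptions for illustration:

```python
import hashlib


def stable_doc_id(raw_line: str, source: str = "") -> str:
    """Return a hex digest that is the same every time the same line is replayed."""
    # `source` is a hypothetical disambiguator (e.g. host plus file path) so that
    # identical lines from different machines do not collapse into one document.
    # The NUL byte separates the two parts so they cannot run together.
    payload = f"{source}\x00{raw_line}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


line = '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
first = stable_doc_id(line, "web01:/var/log/apache2/access.log")
replay = stable_doc_id(line, "web01:/var/log/apache2/access.log")

# A replayed line maps to the same ID, so indexing it again would overwrite
# the existing document instead of creating a duplicate.
assert first == replay
```

One caveat I can see with this: two genuinely distinct requests that produce byte-identical log lines from the same source would also end up with the same ID, so something like the file offset would probably need to go into the hash as well.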
Our environment is such that "at least once" delivery is much easier than "exactly once". Any pointers on how to achieve this? I'm using log-courier (the logstash-forwarder fork) on the edge, forwarding to Logstash running locally on a single-node ES cluster.
I've seen a few custom approaches to doing this (e.g. "How to create my own document_id in logstash?"), but I'm surprised there isn't a Supported Way To Do It - maybe there is and I can't find it?
Thanks in advance,