_id as a consistent hash to avoid duplication with replays


#1

Hi All,

I'm trying to create a situation where I have the option of replaying logs with zero risk of duplication. It seems to me that a good way to do that would be to make a field, possibly a UUID or _id, a consistent hash of the input. The data is a slightly modified Apache log.

Our environment is such that "at least once" delivery is much, much easier than "exactly once". Any pointers on how to achieve this? I'm using log-courier on the edge (the logstash-forwarder fork), forwarding to logstash running locally on a single-node ES cluster.

I've seen a few custom approaches to doing this (e.g. "How to create my own document_id in logstash?") but I'm surprised there isn't a Supported Way To Do It. Maybe there is and I can't find it?

Thanks in advance,
Tom


#2

Solved in "How to create my own document_id in logstash?"
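
For anyone landing here later, here is a minimal sketch of the underlying idea in Python, not the config from the linked thread. It assumes the ID is a SHA-1 of the raw log line; a plain dict stands in for the ES index, and `document_id`, `index_event` and the sample log lines are made up for illustration.

```python
import hashlib


def document_id(raw_line: str) -> str:
    """Hypothetical helper: derive a deterministic ID from the raw log line."""
    return hashlib.sha1(raw_line.encode("utf-8")).hexdigest()


def index_event(index: dict, raw_line: str) -> None:
    """Store the event under its hash, so a replayed line overwrites itself."""
    index[document_id(raw_line)] = {"message": raw_line}


# A plain dict stands in for the Elasticsearch index in this sketch.
index = {}

# Made-up sample lines in Apache log style.
lines = [
    '127.0.0.1 - - [10/Oct/2015:13:55:36 +0000] "GET / HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2015:13:55:37 +0000] "GET /a HTTP/1.1" 404 512',
]

# Simulate "at least once" delivery by sending every line twice.
for line in lines + lines:
    index_event(index, line)

print(len(index))  # 2: the replayed lines did not create duplicates
```

One caveat with hashing only the line: two genuinely distinct events with byte-identical lines (same timestamp, same request) collapse into one document. If that matters, something like the source host or file offset would need to go into the hashed input as well.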

