Removing ongoing duplicates

This is what I need, except I would like to do it on input, while parsing the JSON. What I mean is that this tutorial shows how to do it post-factum, but I would like to do it every time I receive new data, based for example on the data from the last minute or so. Is this possible, and will it avoid generating high load? When data reaches Logstash, the duplicates come in pairs, like system logs where the same line appears 10 times in a row, so there is no need to check the whole index, just the most recent records, even just the last 10 or so. What I need is to remove ongoing duplicates.

You could use a ruby filter. Keep the most recent messages (or a hash of them) in an array and test whether the array .include? the current message. If it does, drop the event; if not, .push the current message onto the end of the array and .shift the oldest entry off the front once the array is full. The cost grows with the number of entries in the array, since Ruby iterates over each entry in .include? — a sketch is below.
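
A minimal sketch of that approach as a Logstash ruby filter, assuming the deduplication key is the message field and a window of the last 10 events (both the field name and the window size are assumptions, adjust them for your data):

```
filter {
  ruby {
    # Runs once at pipeline startup: the window of recent messages.
    init => "@recent = []"
    # Runs for every event.
    code => "
      msg = event.get('message')             # assumed dedup key
      if @recent.include?(msg)
        event.cancel                         # duplicate within the window: drop it
      else
        @recent.push(msg)                    # newest entry goes at the end
        @recent.shift if @recent.size > 10   # oldest entry falls off the front
      end
    "
  }
}
```

With a window of 10 the linear .include? scan is negligible, so the load should be minimal; for a much larger window you could keep digests in a Hash or Set instead of a plain array. Note that @recent lives in memory, so it resets when the pipeline restarts, and if you run multiple pipeline workers the shared array is not synchronized across threads.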
