Can we avoid duplicate records with the fingerprint plugin, or by reading the input only once?


(Ritesh) #1

Hi All

I have an http_poller input that pulls data from an HTTP endpoint, and an http output.

The configuration file works well. The issue is that, because Logstash runs continuously, it keeps inserting duplicate records into the output.

How can I avoid that? I need a solution in which the input is read only once.

My dataset is a bit different; here is its structure.

My input data looks like this. There can be any number of records with the same Name, so I cannot use Name, Area ID, or PID as the unique identifier.

Name      Area ID  PID   Account  Cost  Budget
ABC Corp  1234     5678  20       345   340
ABC Corp  1234     5678  12       456   500

The two rows above are not duplicates; they are different records. But when I keep the pipeline running, it inserts duplicate records.

Please let me know how I can read the data from the input only once, so that I do not create duplicate records in the output.

I think fingerprint is a solution, but I have not found a definite example.
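
As far as I understand it, the idea would be to hash all the columns of a row together and skip any row whose hash has already been seen. Below is a minimal Python sketch of that idea only; it is not Logstash configuration. The field names come from my column headers, and `fingerprint`, `drop_repeats`, and the in-memory `seen` set are hypothetical names made up for illustration.

```python
import hashlib

# Hypothetical field names, taken from the column headers in this post.
FIELDS = ["Name", "Area ID", "PID", "Account", "Cost", "Budget"]


def fingerprint(record):
    """Return a SHA-256 hex digest computed over every column of the record."""
    # Join name=value pairs with a separator so that neighbouring values
    # cannot run together and give the same string for different rows.
    joined = "|".join(f"{name}={record[name]}" for name in FIELDS)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def drop_repeats(records, seen):
    """Return only the records whose fingerprint is not yet in `seen`."""
    fresh = []
    for record in records:
        digest = fingerprint(record)
        if digest not in seen:
            seen.add(digest)
            fresh.append(record)
    return fresh


# The two sample rows from this post: same Name, Area ID and PID,
# but different Account, Cost and Budget.
rows = [
    {"Name": "ABC Corp", "Area ID": 1234, "PID": 5678,
     "Account": 20, "Cost": 345, "Budget": 340},
    {"Name": "ABC Corp", "Area ID": 1234, "PID": 5678,
     "Account": 12, "Cost": 456, "Budget": 500},
]

# Assumption: the set lives in memory only; a long-running pipeline
# would need somewhere persistent to remember fingerprints.
seen = set()
first_poll = drop_repeats(rows, seen)   # both rows are new
second_poll = drop_repeats(rows, seen)  # the same data polled again
print(len(first_poll), len(second_poll))
```

With the two sample rows, the first poll keeps both records because their fingerprints differ, and a second poll of the same data keeps none. Hashing every column, rather than only Name, Area ID, or PID, is what keeps the two distinct rows apart.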

