Hi all,
I have an input as http_poller, pulling data from an HTTP endpoint,
and an output as http.
The configuration file works well. The issue is that, since Logstash polls on a continuous basis, it keeps inserting duplicate records into the output.
How can I avoid that? I need a solution where the input is read only once.
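For reference, my pipeline is shaped roughly like the sketch below. The endpoint URLs and the schedule are placeholders, not my real values:

```
input {
  http_poller {
    urls => {
      # placeholder endpoint, not my real URL
      records => "http://example.com/api/records"
    }
    # re-polls every minute, which is why the same rows get read again
    schedule => { every => "1m" }
    codec => "json"
  }
}

output {
  http {
    url => "http://example.com/ingest"   # placeholder destination
    http_method => "post"
    format => "json"
  }
}
```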
My dataset is a bit different; here is its structure.
The input data looks like this. There can be 'n' records with the same Name, so I cannot use Name, Area ID, or PID as the unique identifier.
Name | Area ID | PID | Account | Cost | Budget |
---|---|---|---|---|---|
ABC Corp | 1234 | 5678 | 20 | 345 | 340 |
ABC Corp | 1234 | 5678 | 12 | 456 | 500 |
The two rows above are not duplicates; they are genuinely different records. But when I keep the pipeline running, it inserts duplicate copies of them.
Please let me know how I can read the data only once from the input, so that I don't create duplicate records in the output.
I think the fingerprint filter could be a solution, but I have not found a definite example.
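What I have in mind is something like the minimal sketch below: hash the combination of all columns (since only the full row is unique) into an ID I could deduplicate on downstream. The field names here are taken from my table; the target field is just an assumption on my part:

```
filter {
  fingerprint {
    # hash the whole combination of columns, since no single field is unique
    source => ["Name", "Area ID", "PID", "Account", "Cost", "Budget"]
    concatenate_sources => true
    method => "SHA256"
    target => "[@metadata][record_id]"
  }
}
```

Is this the right direction, and is there a definite example of using such a fingerprint to stop duplicates?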