I am using Logstash with the http_poller input: I query an API and save the data in Elasticsearch. That works, but every poll saves the same events again, and I need each event stored only once.
I would like to save only events that are not already in the index. If the value of data.attributes.cveID is not already stored in Elasticsearch, save the event; otherwise drop it.
For example, if there is no CVE-2022-21698 in ES, save it; otherwise drop the event.
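You can do this by overwriting the document each time: if the document ID is derived from the CVE ID, a repeated event replaces the existing document instead of creating a duplicate.
Use document_id => "%{[data][attributes][cveID]}" in the elasticsearch output.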
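A minimal sketch of what that output could look like. The host and the index name "cve" are placeholders I made up for illustration, not values from your setup:

```
output {
  elasticsearch {
    # Assumed host and index name; replace with your own.
    hosts => ["http://localhost:9200"]
    index => "cve"
    # Use the CVE ID as the document _id, so an event with the same
    # cveID overwrites the existing document instead of adding a new one.
    document_id => "%{[data][attributes][cveID]}"
    # Optional variation: with action => "create", Elasticsearch rejects
    # the write when a document with that _id already exists, so the
    # first copy is kept rather than overwritten.
    # action => "create"
  }
}
```

With the default action (index) the latest copy wins. If you would rather keep the first copy, the commented-out action => "create" line is the variation to try; as far as I know the rejected duplicates then appear as conflict warnings in the Logstash log.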
If you really want to drop the event, you could use an elasticsearch filter to look up whether the document already exists. However, in my very limited experience that is fragile and does not work as well as you would expect (possibly because of the way Logstash processes batches).
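For completeness, a rough sketch of the lookup approach, under several assumptions: the host and the index name "cve" are placeholders, the cveID field is assumed to be searchable with an exact query-string match, and [@metadata][existing_cve] is a scratch field name I invented. I have not verified this against your data:

```
filter {
  elasticsearch {
    # Assumed host and index name; replace with your own.
    hosts => ["http://localhost:9200"]
    index => "cve"
    # Look for an existing document with the same CVE ID. The value is
    # quoted because CVE IDs contain hyphens.
    query => 'data.attributes.cveID:"%{[data][attributes][cveID]}"'
    # If a match is found, copy its cveID into a scratch metadata field
    # (hypothetical name) that is never sent to the output.
    fields => { "[data][attributes][cveID]" => "[@metadata][existing_cve]" }
  }
  # The scratch field is only set when the lookup found a document.
  if [@metadata][existing_cve] {
    drop { }
  }
}
```

Note that two copies of the same CVE arriving in one batch can both pass the lookup before either is indexed, which is one reason this is less reliable than setting document_id.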