Save only unique events that are not yet stored in ES based on field value

Hi all,

I am using Logstash with the http_poller input. Basically, I am querying an API and saving the data in ELK. All is fine, except that I keep saving the same events, and I need only unique ones.

I would like to save only those events that are not already in the DB. If the value of data.attributes.cveID is not already stored in Elasticsearch, save the event; otherwise, drop it.

For example, if CVE-2022-21698 is not yet in ES, save it; otherwise, drop the event.

How can I do this?

Here is my logstash.conf:


input { 
  http_poller {
    urls => {
      test2 => {
        method => get
        url => "http://localhost:1337/api/webs/"
      }
    }
    request_timeout => 60
    schedule => { cron => "* * * * * UTC"}
    codec => "json"
    metadata_target => "http_poller_metadata"
  }
}

filter {
  split { field => "[data]" }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    user => "logstash_internal"
    password => "${LOGSTASH_INTERNAL_PASSWORD}"
  }
}


If you could help me, I would owe you a lot!

You can do this by overwriting the document each time.

Use document_id => "%{[data][attributes][cveID]}" in the elasticsearch output. Note that the value has to be a quoted string.
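
As a rough sketch, the output section could look like the one below. The hosts and credentials are copied from the question; the field path assumes that the split filter leaves each item under [data], so adjust it if your events are shaped differently. With a fixed document ID, a repeated CVE overwrites the existing document instead of adding a new one. If you would rather keep the first stored copy, the output also has an action setting: as far as I know, with "create" Elasticsearch rejects a document whose ID already exists and Logstash reports the conflict in its log.

```conf
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    user => "logstash_internal"
    password => "${LOGSTASH_INTERNAL_PASSWORD}"
    # Use the CVE ID as the document ID, so the same CVE always maps to the same document.
    # Assumption: after the split filter, each item lives under [data].
    document_id => "%{[data][attributes][cveID]}"
    # Optional: reject an event whose ID already exists instead of overwriting it.
    # action => "create"
  }
}
```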

If you really want to drop the event, then you could use an elasticsearch filter to do a lookup and see whether a document with that ID already exists. However, in my very limited experience, that approach is fragile and does not work as well as you would expect (possibly because of the way Logstash processes batches).
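
A minimal sketch of that lookup approach follows, under several assumptions that are not from the question: the index pattern logstash-*, a cveID.keyword subfield from the default dynamic mapping, a logstash_internal user that is allowed to read the index, and the hypothetical marker field cve_first_seen. The filter copies the stored document's @timestamp into the marker field when a match is found, and the conditional then drops the event. Events that arrive in the same batch, or before the index has refreshed, would not be found, which is part of why this can be unreliable.

```conf
filter {
  split { field => "[data]" }

  # Look for an already stored document with the same CVE ID.
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    user => "logstash_internal"
    password => "${LOGSTASH_INTERNAL_PASSWORD}"
    # Assumption: the index pattern the output writes to.
    index => "logstash-*"
    # Assumption: default dynamic mapping, so an exact-match .keyword subfield exists.
    query => 'data.attributes.cveID.keyword:"%{[data][attributes][cveID]}"'
    # On a match, copy the stored @timestamp into a hypothetical marker field.
    fields => { "@timestamp" => "cve_first_seen" }
  }

  # The marker field is only set when the lookup found a document.
  if [cve_first_seen] {
    drop { }
  }
}
```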

Here is the output section of my logstash.conf:

output {
	elasticsearch {
		hosts => ["http://elasticsearch:9200"]
		user => "logstash_internal"
		password => "${LOGSTASH_INTERNAL_PASSWORD}"
		document_id => %{[attributes][cveID]}
	}
}

I got an error:


[logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", [A-Za-z0-9_-], '\"', \"'\", [A-Za-z_], \"-\", [0-9], \"[\", \"{\" at line 36, column 18 (byte 725) after output {\n\telasticsearch {\n\t\thosts => [\"http://elasticsearch:9200\"]\n\t\tuser => \"logstash_internal\"\n\t\tpassword => \"${LOGSTASH_INTERNAL_PASSWORD}\"\n\t\tdocument_id => ", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:32:in `compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:189:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:72:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:47:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:50:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:376:in `block in converge_state'"]}

Add double quotes around the value: "%{[attributes][cveID]}".

It works like a charm! Thank you!
You're the best!
