Duplicate entries in Elasticsearch with fingerprint and UUID


(Christian) #1

Hello everyone,

I am trying to send syslog data from rsyslog to Logstash as JSON, and this is working almost as expected. Unfortunately, I can see every event twice in Elasticsearch.

First I double-checked that Logstash sends each event only once with:

output {
	elasticsearch {
		flush_size => '100'
		hosts => ["http://localhost:9200"]
		index => "%{type}-%{+YYYY.MM.dd}"
	}
	file {
		path => '/var/log/logstash/%{type}_debug.log'
		codec => rubydebug {
			metadata => true
		}
		#codec => line
	}
}

There is only one entry in the logfile:

{
			"severity" => "5",
	   "sudo_user_src" => "[ID 702911 local5.notice]    user",
			  "hostIP" => "10.3.13.4",
			"sudo_pwd" => "/home/user",
		   "@metadata" => {
		"uuid" => "f34ba503-fcc4-4c09-b0d1-7cfe3a95896b"
	},
			"sudo_tty" => "pts/3",
			 "program" => "sudo",
				"type" => "syslog_solaris",
			"priority" => "173",
			 "message" => "[ID 702911 local5.notice]    user : TTY=pts/3 ; PWD=/home/user ; USER=root ; COMMAND=/usr/bin/su - root -c echo user_test5",
		  "@timestamp" => 2018-06-14T09:25:35.000Z,
	"sudo_user_target" => "root",
				"host" => "svrefa037.domain.com",
			"@version" => "1",
				 "tag" => "sudo:",
			"facility" => "21",
		"sudo_command" => "/usr/bin/su - root -c echo user_test5",
	  "severity_label" => "notice",
	  "facility_label" => "local5"
}

After that I read this document:
https://www.elastic.co/blog/logstash-lessons-handling-duplicates

In my opinion it would be a good idea to add a fingerprint with a UUID, so I changed my config like this:

filter {
	fingerprint {
		target => "[@metadata][uuid]"
		method => "UUID"
	}
}
output {
	elasticsearch {
		flush_size => '100'
		hosts => ["http://svrl1esmlsp01:9200"]
		document_id => "%{[@metadata][uuid]}"
		index => "%{type}-%{+YYYY.MM.dd}"
	}
}
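
A side note on this approach: method => "UUID" generates a random ID per event, so two copies of the same event still get different document IDs and are not deduplicated — it only guards against the same pipeline output being retried. For actual deduplication, the blog post linked above recommends a content-based hash used as the document_id. A minimal sketch along those lines (the key value is a placeholder; any string works for the HMAC):

filter {
	fingerprint {
		# hash the event content instead of generating a random UUID,
		# so identical events map to the same document_id
		source => ["message"]
		method => "SHA1"
		key => "changeme"
		target => "[@metadata][fingerprint]"
	}
}
output {
	elasticsearch {
		hosts => ["http://localhost:9200"]
		document_id => "%{[@metadata][fingerprint]}"
		index => "%{type}-%{+YYYY.MM.dd}"
	}
}

With this, a duplicate event overwrites the first copy (same _id) instead of creating a second document.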

In this case I get two almost identical entries in Elasticsearch. The only value that differs is the '_id' field:
Entry one:

"_id": "AWP9nVM51nozggsAltkf",

Entry two:

"_id": "f34ba503-fcc4-4c09-b0d1-7cfe3a95896b",

After that I noticed a different way to configure it:

filter {
	fingerprint {
		target => "%{[@metadata][uuid]}"
		method => "UUID"
	}
}

This way seems to work, but I am unsure whether it is correct, because I get an '_id' as well as a cryptic field '%{.@metadata.uuid.}' containing the UUID:

{
  "_id": "AWP9sVJ81nozggsArk-G",
  "_version": 1,
  "_score": null,
  "_source": {
	"@timestamp": "2018-06-14T09:47:26.000Z",
	"%{": {
	  "@metadata": {
		"uuid": {
		  "}": "d05d29f4-8af6-4eb5-babd-b5a21973d99a"
		}
	  }
	}
  }
}
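
The nested "%{" and "}" keys suggest that the sprintf wrapper in target was taken literally: target expects a plain field reference, not a %{...} expression, so Logstash created fields out of the literal characters. If the goal is just to store the UUID under @metadata, the first form should be the intended syntax (a sketch, matching the earlier config):

filter {
	fingerprint {
		# plain field reference, no %{...} wrapper
		target => "[@metadata][uuid]"
		method => "UUID"
	}
}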

Which way is the correct one for this problem?

Regards,

Christian


(Magnus Bäck) #2

Why would you have duplicate events in the first place? I suspect you have more than one configuration file containing an elasticsearch output.


(Christian) #3

...shame on me. I had an old config file named
output-elasticsearch.conf.bak

I assumed Logstash would not read this file because the extension is not .conf... I was wrong.

Thanks a lot.


(Magnus Bäck) #4

I assumed Logstash would not read this file because the extension is not .conf... I was wrong.

IIRC that's exactly how Logstash 6+ behaves.
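
If I remember correctly, the Logstash 6 packages ship a pipelines.yml whose config glob only matches *.conf, so a .bak file in conf.d is skipped. A sketch, assuming the default package paths:

# /etc/logstash/pipelines.yml
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"

On older versions, pointing path.config at a bare directory makes Logstash concatenate every file in it, which is how a leftover .bak produces a second elasticsearch output.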


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.