Hi,
My setup has been working perfectly for a week now, but today I noticed that an entire logfile got submitted twice (the whole content was written around the same time, so it was probably transmitted as one batch).
The only thing I noticed was the following in the Filebeat log:
2018-11-07T07:45:09.878+0100	INFO	log/harvester.go:251	Harvester started for file: C:\logs\download.log
2018-11-07T07:45:10.879+0100	ERROR	logstash/async.go:256	Failed to publish events caused by: write tcp 10.122.65.225:54214->10.122.65.223:5044: wsasend: An existing connection was forcibly closed by the remote host.
2018-11-07T07:45:11.881+0100	ERROR	pipeline/output.go:121	Failed to publish events: write tcp 10.122.65.225:54214->10.122.65.223:5044: wsasend: An existing connection was forcibly closed by the remote host.
2018-11-07T07:45:11.881+0100	INFO	pipeline/output.go:95	Connecting to backoff(async(tcp://10.122.65.223:5044))
2018-11-07T07:45:11.881+0100	INFO	pipeline/output.go:105	Connection to backoff(async(tcp://10.122.65.223:5044)) established
2018-11-07T07:45:26.983+0100	INFO	[monitoring]	log/log.go:141	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":16146},"total":{"ticks":21949,"value":21949},"user":{"ticks":5803}},"info":{"ephemeral_id":"9a74d02c-78cc-4eff-8223-6200a82df1a7","uptime":{"ms":598530415}},"memstats":{"gc_next":6522208,"memory_alloc":3494712,"memory_total":9663826592,"rss":4136960}},"filebeat":{"events":{"added":23,"done":23},"harvester":{"open_files":2,"running":2,"started":1},"input":{"log":{"files":{"renamed":1}}}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":21,"batches":3,"failed":20,"total":41},"read":{"bytes":12},"write":{"bytes":1165,"errors":1}},"pipeline":{"clients":1,"events":{"active":0,"filtered":2,"published":21,"retry":40,"total":23},"queue":{"acked":21}}},"registrar":{"states":{"current":43,"update":23},"writes":{"success":3,"total":3}}}}}
2018-11-07T07:45:39.902+0100	INFO	log/harvester.go:251	Harvester started for file: C:\logs\production.log
So it would appear that there was some network issue, but the events seem to have been submitted despite that, and were then submitted again on retry.
Is there any way to prevent this in the future?
We should of course find out what the network issue was on our end, but such issues can happen, and I'd hate to end up with duplicated logs every time one does. I may not always notice it, either.
I would rather not use some field from our logs as the document ID in ES: I don't really see a candidate field, and even if I did, I'd be hesitant to apply deduplication that way (a rough sketch of what I mean is below).
I hope there is a way to alleviate this issue.
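For reference, this is roughly what I understand that approach would look like if I went with it anyway: Logstash's fingerprint filter hashing the whole message (rather than one of our own fields) and the result used as the document ID, so a resent batch overwrites instead of duplicating. The key string and metadata field name below are just placeholders:
filter {
  fingerprint {
    # hash the entire log line instead of relying on a single field from our logs
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA256"
    key => "dedupe"   # any static string; with a key set, SHA256 is applied as an HMAC
  }
}
output {
  elasticsearch {
    hosts => ["myhostinthecloud:9243"]
    user => xxx
    password => xxx
    # identical content -> identical ID, so a retried batch updates the existing document
    document_id => "%{[@metadata][fingerprint]}"
  }
}
The obvious downside is that two genuinely identical log lines would collapse into one document, which is exactly why I'm hesitant about this kind of deduplication.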
Filebeat config:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\logs\*\*
  close_inactive: 5h
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 3
setup.kibana:
output.logstash:
  hosts: ["10.122.65.223:5044"]
Logstash config:
input {
  beats {
    port => 5044
    host => "0.0.0.0"
  }
}
filter {
  grok {
    match => {
      "message" => [
        "\A%{TIMESTAMP_ISO8601}\s-\s[a-zA-Z0-9]+\s-\s%{LOGLEVEL:log-level}\s-\sBestelltnummer:\s(?<bestellnummer>[0-9]{9}),\sILN:\s(?<iln>[0-9]{13}),\sKundenNr\. (?<kundennr>[0-9]{6}),\s(?<stueckzahl>[0-9]{1,3})\sST,\sArtNr.:\s(?<artikelnr>[0-9]{13,14})",
        "\A%{TIMESTAMP_ISO8601}\s-\s[a-zA-Z0-9]+\s-\s%{LOGLEVEL:log-level}\s-\s%{DATA}:\s(?<ris-docid>%{GREEDYDATA})",
        "\A%{TIMESTAMP_ISO8601}\s-\s[a-zA-Z0-9]+\s-\s%{LOGLEVEL:log-level}\s-\s"
      ]
    }
  }
}
filter {
  mutate {
    convert => { "stueckzahl" => "integer" }
  }
}
output {
  elasticsearch {
    hosts => ["myhostinthecloud:9243"]
    #index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    user => xxx
    password => xxx
  }
}
/EDIT:
I did add another config yesterday in logstash/conf.d, but I don't see how that would be causing it:
input {
  gelf {
    port => 12201
    type => gelf
    host => "0.0.0.0"
  }
}
output {
  elasticsearch {
    hosts => ["myhostinthecloud:9243"]
    #index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    user => xxx
    password => xxx
  }
}
            