Failed to publish events caused by: write tcp 10.90.66.80:57738->10.90.66.48:5044: write: connection reset by peer

2018-09-06T16:26:36.086-0400 ERROR logstash/async.go:252 Failed to publish events caused by: write tcp 10.90.66.80:57738->10.90.66.48:5044: write: connection reset by peer
2018-09-06T16:26:37.086-0400 ERROR pipeline/output.go:109 Failed to publish events: write tcp 10.90.66.80:57738->10.90.66.48:5044: write: connection reset by peer
2018-09-06T16:26:45.356-0400 INFO [monitoring] log/log.go:141 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":150,"time":{"ms":6}},"total":{"ticks":720,"time":{"ms":11},"value":720},"user":{"ticks":570,"time":{"ms":5}}},"info":{"ephemeral_id":"0cc79eaf-a77d-4b98-af8c-f270ec25b230","uptime":{"ms":2400034}},"memstats":{"gc_next":6226960,"memory_alloc":3131584,"memory_total":141931712}},"filebeat":{"events":{"added":1,"done":1},"harvester":{"open_files":2,"running":2}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":1,"batches":2,"failed":1,"total":2},"read":{"bytes":6},"write":{"bytes":324,"errors":1}},"pipeline":{"clients":1,"events":{"active":0,"published":1,"retry":2,"total":1},"queue":{"acked":1}}},"registrar":{"states":{"current":8,"update":1},"writes":{"success":1,"total":1}},"system":{"load":{"1":0,"15":0,"5":0,"norm":{"1":0,"15":0,"5":0}}}}}}
2018-09-06T16:27:15.356-0400 INFO [monitoring] log/log.go:141 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":150},"total":{"ticks":730,"time":{"ms":5},"value":730},"user":{"ticks":580,"time":{"ms":5}}},"info":{"ephemeral_id":"0cc79eaf-a77d-4b98-af8c-f270ec25b230","uptime":{"ms":2430034}},"memstats":{"gc_next":6226960,"memory_alloc":4940048,"memory_total":143740176,"rss":-90112}},"filebeat":{"events":{"added":6,"done":6},"harvester":{"open_files":2,"running":2}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":6,"batches":1,"total":6},"read":{"bytes":6},"write":{"bytes":615}},"pipeline":{"clients":1,"events":{"active":0,"published":6,"total":6},"queue":{"acked":6}}},"registrar":{"states":{"current":8,"update":6},"writes":{"success":1,"total":1}},"system":{"load":{"1":0,"15":0,"5":0,"norm":{"1":0,"15":0,"5":0}}}}}}

There are no errors in Logstash at this time. I have confirmed that Logstash is running and that there is a listener on port 5044.
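For reference, something like the following confirms the listener and basic TCP reachability (a sketch; it assumes ss and nc are installed on the hosts):

ss -tlnp | grep 5044                      # on the Logstash host: is anything listening on 5044?
nc -vz logstash.sand.corvesta.net 5044    # from the Filebeat host: can a TCP connection be opened?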

filebeat.yml
root@ip-10-90-66-80:~# cat /etc/filebeat/filebeat.yml
filebeat:
  inputs:
  - type: log
    paths:
      - /var/log/vault/**.log
      - /var/log/*.log
    tags: [vault]

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["logstash.sand.corvesta.net:5044"]

DNS
root@ip-10-90-66-80:~# dig logstash.sand.corvesta.net

; <<>> DiG 9.11.3-1ubuntu1.1-Ubuntu <<>> logstash.sand.corvesta.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58912
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;logstash.sand.corvesta.net. IN A

;; ANSWER SECTION:
logstash.sand.corvesta.net. 22 IN A 10.90.66.48
logstash.sand.corvesta.net. 22 IN A 10.90.66.92

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Thu Sep 06 16:29:19 EDT 2018
;; MSG SIZE rcvd: 87

logstash config
input {
  stdin { }

  gelf {
    host => "0.0.0.0"
    port => 12201
  }

  udp {
    codec => json
    port => 5001
  }

  tcp {
    port => 5000
    codec => json
  }

  beats {
    port => 5044
  }

  http {
    port => 8000
    type => "elb-healthcheck"
  }
}

filter {
  if [type] == "elb-healthcheck" {
    drop { }
  }
}

filter {
  if [type] == "kube-logs" {

    mutate {
      rename => ["log", "message"]
    }

    date {
      match => ["time", "ISO8601"]
      remove_field => ["time"]
    }

    grok {
      match => { "source" => "/var/log/containers/%{DATA:pod_name}_%{DATA:namespace}_%{GREEDYDATA:container_name}-%{DATA:container_id}.log" }
      remove_field => ["source"]
    }
  }
}

filter {
  # Map log levels to integers; the "level" index field is an integer and blows up when it receives a string.
  # Log level mappings from: https://docs.python.org/2/library/logging.html
  mutate {
    gsub => [
      "level", "DEBUG", "10",
      "level", "INFO", "20",
      "level", "WARN", "30",
      "level", "ERROR", "40",
      "level", "CRITICAL", "50",
      "level", "NOTSET", "0"
    ]
  }

  # Convert log level to an integer (after the mapping above)
  mutate {
    convert => {
      "level" => "integer"
    }
  }
}

output {
  if 'test' in [tags] {
    elasticsearch {
      hosts => ["http://vpc-sand-logs-mtsand4fqaytxaye657cxthpza.us-east-1.es.amazonaws.com:80"]
      index => "test-%{+YYYY.MM.dd}"
    }
  }
  else {
    elasticsearch {
      hosts => ["http://vpc-sand-logs-mtsand4fqaytxaye657cxthpza.us-east-1.es.amazonaws.com:80"]
    }
  }
}

I am using the GELF logger for all of my containers and it is working just fine. I don't know what else to look at, and I don't see any error other than the peer closing the connection. Any help would be amazing.

Could you try updating the Beats input plugin in Logstash?

bin/logstash-plugin update logstash-input-beats

If that does not help, you could increase client_inactivity_timeout so that Logstash does not close inactive Beats connections too early. See: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html#plugins-inputs-beats-client_inactivity_timeout
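For example, in the beats input (a sketch; 300 seconds is an arbitrary value, the plugin's default is 60):

beats {
  port => 5044
  client_inactivity_timeout => 300
}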

It looks like there is an error in Logstash at that time after all.

2018-09-07T14:20:53.770-0400 ERROR logstash/async.go:252 Failed to publish events caused by: write tcp 10.90.66.80:42910->10.90.66.92:5044: write: connection reset by peer
2018-09-07T14:20:54.770-0400 ERROR pipeline/output.go:109 Failed to publish events: write tcp 10.90.66.80:42910->10.90.66.92:5044: write: connection reset by peer

[2018-09-07T18:20:54,927][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2018.09.07", :_type=>"doc", :_routing=>nil}, #LogStash::Event:0x61e206d7], :response=>{"index"=>{"_index"=>"logstash-2018.09.07", "_type"=>"doc", "_id"=>"nWVDtWUBATnjgFer94MK", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [host]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:181"}}}}}

I didn't notice this before because the clocks don't match; I only noticed it after restarting the Logstash server.

I tried both of your suggestions and I am still getting this error. Thanks in advance.

This error is caused by a breaking change in Beats which affects everyone forwarding events to Logstash: https://www.elastic.co/guide/en/beats/libbeat/master/breaking-changes-6.3.html#breaking-changes-mapping-conflict
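One common workaround on the Logstash side (a sketch, not necessarily the exact fix the linked page recommends) is to flatten the new host object back into a plain string before it hits the existing index mapping:

filter {
  # Assumes Filebeat 6.x still populates [beat][hostname]
  if [beat][hostname] {
    mutate {
      replace => { "[host]" => "%{[beat][hostname]}" }
    }
  }
}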
