--> A bit later I often got these messages:
2018-03-01T18:01:59.254Z DEBUG [prospector] log/prospector.go:147 Start next scan
2018-03-01T18:01:59.254Z DEBUG [prospector] log/prospector.go:168 Prospector states cleaned up. Before: 0, After: 0
2018-03-01T18:01:59.349Z DEBUG [transport] transport/client.go:201 handle error: read tcp [::1]:59322->[::1]:5044: i/o timeout
2018-03-01T18:01:59.349Z ERROR logstash/async.go:235 Failed to publish events caused by: read tcp [::1]:59322->[::1]:5044: i/o timeout
2018-03-01T18:01:59.349Z DEBUG [transport] transport/client.go:114 closing
2018-03-01T18:01:59.376Z DEBUG [logstash] logstash/async.go:142 2002 events out of 2002 events sent to logstash host localhost:5044. Continue sending
2018-03-01T18:01:59.376Z DEBUG [logstash] logstash/async.go:99 close connection
2018-03-01T18:01:59.376Z DEBUG [logstash] logstash/async.go:99 close connection
2018-03-01T18:01:59.376Z ERROR logstash/async.go:235 Failed to publish events caused by: client is not connected
2018-03-01T18:02:00.255Z DEBUG [prospector] prospector/prospector.go:124 Run prospector
2018-03-01T18:02:00.255Z DEBUG [prospector] log/prospector.go:147 Start next scan
2018-03-01T18:02:00.255Z DEBUG [prospector] log/prospector.go:168 Prospector states cleaned up. Before: 0, After: 0
2018-03-01T18:02:00.377Z ERROR pipeline/output.go:92 Failed to publish events: client is not connected
2018-03-01T18:02:00.377Z DEBUG [logstash] logstash/async.go:94 connect
2018-03-01T18:02:00.403Z DEBUG [logstash] logstash/async.go:142 2002 events out of 2002 events sent to logstash host localhost:5044. Continue sending
2018-03-01T18:02:01.176Z INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":140,"time":142},"total":{"ticks":610,"time":620,"value":610},"user":{"ticks":470,"time":478}},"info":{"ephemeral_id":"f9fdc99c-0058-44f2-8712-8b36107c1653","uptime":{"ms":120025}},"memstats":{"gc_next":10079840,"memory_alloc":7263936,"memory_total":38221080,"rss":40960}},"filebeat":{"harvester":{"open_files":1,"running":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"batches":2,"failed":4004,"total":4004},"read":{"errors":1},"write":{"bytes":127585}},"pipeline":{"clients":26,"events":{"active":2002,"retry":6006}}},"registrar":{"states":{"current":1}},"system":{"load":{"1":1.86,"15":1.53,"5":1.58,"norm":{"1":0.465,"15":0.3825,"5":0.395}}}}}}
2018-03-01T18:02:01.212Z DEBUG [prospector] prospector/prospector.go:124 Run prospector
2018-03-01T18:02:01.212Z DEBUG [prospector] prospector/prospector.go:124 Run prospector
2018-03-01T18:02:01.212Z DEBUG [prospector] log/prospector.go:147 Start next scan
To me it looks like Filebeat does not receive any ACK from Logstash, so it keeps resending. I did not find any errors in the Logstash log at loglevel INFO.
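For illustration, here is a quick Python sketch to count publish failures against batches reported as sent in a Filebeat debug log like the excerpt above (the path filebeat.log is just a placeholder):

```python
# Count publish failures vs. batches reported as sent in a Filebeat debug log.
# "filebeat.log" is a placeholder path; point it at the actual log file.
import re
from collections import Counter

sent_re = re.compile(r"(\d+) events out of (\d+) events sent to logstash")
counts = Counter()

with open("filebeat.log", encoding="utf-8", errors="replace") as f:
    for line in f:
        if "Failed to publish events" in line:
            counts["publish failures"] += 1
        match = sent_re.search(line)
        if match:
            counts["batches reported sent"] += 1
            counts["events reported sent"] += int(match.group(1))

for key, value in counts.items():
    print(f"{key}: {value}")
```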
Another interesting aspect:
The problem does not occur with every logfile. Could an unclean logfile (e.g. some unprintable characters inside a UTF-8 or Windows-1252 encoded file) lead to such an error?
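To check that hypothesis, a small Python sketch like this (the path suspect.log is a placeholder for the harvested logfile) flags lines that are not valid UTF-8 or that contain unprintable control characters:

```python
# Report lines that are not valid UTF-8 or contain control characters
# other than tab, newline, and carriage return.
# "suspect.log" is a placeholder for the logfile Filebeat is harvesting.
CONTROL_BYTES = set(range(0x20)) - {0x09, 0x0A, 0x0D}

with open("suspect.log", "rb") as f:
    for lineno, raw in enumerate(f, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as err:
            print(f"line {lineno}: not valid UTF-8 ({err})")
        if any(b in CONTROL_BYTES for b in raw):
            print(f"line {lineno}: unprintable control characters")
```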
I don't know whether this is the same or only a similar issue:
I checked Logstash's pipeline statistics:
"pipelines": {
"main": {
"events": {
"duration_in_millis": 4798742,
"in": 14641,
"out": 13125,
"filtered": 13125,
"queue_push_duration_in_millis": 13840779
},
"plugins": {
"inputs": [{
"id": "input:beats",
"events": {
"out": 14641,
"queue_push_duration_in_millis": 13840779
},
"current_connections": 17,
"name": "beats",
"peak_connections": 17
}],
So we can see that many more events have been transferred than my input file actually contains.
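For reference, the same counters can be pulled and compared with a short Python sketch, assuming the stats above come from Logstash's node stats API on the default port 9600 and the pipeline is named "main":

```python
# Pull the pipeline event counters from Logstash's node stats API and
# compare events in vs. out. Assumes the default API address localhost:9600
# and a pipeline named "main", as in the stats above.
import json
from urllib.request import urlopen

with urlopen("http://localhost:9600/_node/stats/pipelines") as resp:
    stats = json.load(resp)

events = stats["pipelines"]["main"]["events"]
print(f"in: {events['in']}  filtered: {events['filtered']}  out: {events['out']}")
print(f"in - out: {events['in'] - events['out']}")
```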
BTW, the same issue (I only checked the Logstash output here) seems to occur if I use Filebeat 5.6.7 to ship to Logstash 6.2.1. With the same config / logfile, Filebeat 5.6.7 shipping to Logstash 5.2.0 works like a charm.
Sorry, I reached the 7k character limit for a single post, so I split it up.