Filebeat throwing i/o timeout while sending logs to logstash

Hi,

When I started my Filebeat in debug mode, I see the logs below -

2018-04-17T10:42:22.876Z	DEBUG	[prospector]	log/prospector.go:499	Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-17T10:42:22.876Z	DEBUG	[prospector]	log/prospector.go:168	Prospector states cleaned up. Before: 2, After: 2
2018-04-17T10:42:28.025Z	DEBUG	[transport]	transport/client.go:201	handle error: write tcp 172.31.11.2:33918->172.31.1.52:5044: i/o timeout
2018-04-17T10:42:28.025Z	DEBUG	[logstash]	logstash/async.go:142	2048 events out of 2048 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-17T10:42:28.025Z	DEBUG	[logstash]	logstash/async.go:99	close connection
2018-04-17T10:42:28.025Z	DEBUG	[transport]	transport/client.go:114	closing
2018-04-17T10:42:28.025Z	ERROR	logstash/async.go:235	Failed to publish events caused by: write tcp 172.31.11.2:33918->172.31.1.52:5044: i/o timeout
2018-04-17T10:42:28.025Z	DEBUG	[logstash]	logstash/async.go:99	close connection
2018-04-17T10:42:29.026Z	ERROR	pipeline/output.go:92	Failed to publish events: write tcp 172.31.11.2:33918->172.31.1.52:5044: i/o timeout
2018-04-17T10:42:29.026Z	DEBUG	[logstash]	logstash/async.go:94	connect
2018-04-17T10:42:32.876Z	DEBUG	[prospector]	prospector/prospector.go:124	Run prospector
2018-04-17T10:42:32.876Z	DEBUG	[prospector]	prospector/prospector.go:124	Run prospector
2018-04-17T10:42:32.876Z	DEBUG	[prospector]	log/prospector.go:147	Start next scan
2018-04-17T10:42:32.876Z	DEBUG	[prospector]	log/prospector.go:361	Check file for harvesting: /var/log/mesos/mesos-agent.log
2018-04-17T10:42:32.876Z	DEBUG	[prospector]	log/prospector.go:447	Update existing file for harvesting: /var/log/mesos/mesos-agent.log, offset: 338883
2018-04-17T10:42:32.876Z	DEBUG	[prospector]	log/prospector.go:499	Harvester for file is still running: /var/log/mesos/mesos-agent.log
2018-04-17T10:42:32.876Z	DEBUG	[prospector]	log/prospector.go:168	Prospector states cleaned up. Before: 1, After: 1
2018-04-17T10:42:32.876Z	DEBUG	[prospector]	log/prospector.go:147	Start next scan

filebeat.yml

---
filebeat.prospectors:
- type: log
  paths:
    - /dcos/volume1/carbook-test-nginx/logs/access*
    - /dcos/volume1/carbook-test-nginx/logs/error*
  fields:
     log_type: cb-test-nginx

- type: log
  paths:
    - /var/log/mesos/*.log
  fields:
     log_type: mesos-log
  exclude_files: ["stdout.logrotate.state", "stdout.logrotate.conf", "stderr.logrotate.state", "stderr.logrotate.conf"]
  
output.logstash:
  hosts:
    - "172.31.1.52:5044"
  bulk_max_size: 4096
logging.level: debug
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644

logstash.conf file under the pipeline directory


input {
  beats {
    port => "5044"
    client_inactivity_timeout => "600"
  }
}

filter {
  if [fields][log_type] == "cb-test-nginx" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG} %{BASE16FLOAT} %{BASE16FLOAT} ." }
    }
  }
  else if [fields][log_type] == "mesos-log" {
    grok {
      match => { "message" => "%{GREEDYDATA}" }
    }
  }
}

output {
  if [fields][log_type] == "cb-test-nginx" {
    elasticsearch {
      hosts => ["172.31.1.37:9201","172.31.1.46:9201","172.31.1.47:9201"]
      index => "cb-test-nginx-%{+YYYY.MM.dd}"
    }
  }
  else if [fields][log_type] == "mesos-log" {
    elasticsearch {
      hosts => ["172.31.1.37:9201","172.31.1.46:9201","172.31.1.47:9201"]
      index => "mesos-log-%{+YYYY.MM.dd}"
    }
  }
}

Hi,

The timeout occurs while waiting for the ACK signal from Logstash. The default timeout is 60 seconds. If Logstash is actively processing a batch of events, it sends an ACK signal every 5 seconds.

Reasons why this signal is not received by Filebeat can be either network issues or contention in Logstash (induced by additional back-pressure from its outputs).
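For example (just a sketch, the exact values are illustrative and not something I have tested on your setup), you could lower the batch size and give Logstash more time to ACK in filebeat.yml:

output.logstash:
  hosts:
    - "172.31.1.52:5044"
  bulk_max_size: 1024   # smaller batches get acknowledged sooner
  timeout: 120          # seconds to wait for the ACK before the i/o timeout fires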

Please try these settings and let me know if they help.

Thanks,
Harsh Bajaj

Hi, Thanks,

I tried with bulk_max_size: 1024 in filebeat.yml, but I am getting a similar error.

2018-04-18T08:58:39.634Z	DEBUG	[prospector]	log/prospector.go:499	Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T08:58:39.634Z	DEBUG	[prospector]	log/prospector.go:361	Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T08:58:39.634Z	DEBUG	[prospector]	log/prospector.go:447	Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log, offset: 665238
2018-04-18T08:58:39.634Z	DEBUG	[prospector]	log/prospector.go:499	Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T08:58:39.634Z	DEBUG	[prospector]	log/prospector.go:168	Prospector states cleaned up. Before: 2, After: 2
2018-04-18T08:58:44.767Z	DEBUG	[transport]	transport/client.go:201	handle error: read tcp 172.31.11.2:33100->172.31.1.52:5044: i/o timeout
2018-04-18T08:58:44.767Z	ERROR	logstash/async.go:235	Failed to publish events caused by: read tcp 172.31.11.2:33100->172.31.1.52:5044: i/o timeout
2018-04-18T08:58:44.767Z	DEBUG	[transport]	transport/client.go:114	closing
2018-04-18T08:58:44.767Z	DEBUG	[transport]	transport/client.go:201	handle error: write tcp 172.31.11.2:33100->172.31.1.52:5044: use of closed network connection
2018-04-18T08:58:44.767Z	DEBUG	[logstash]	logstash/async.go:142	1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T08:58:44.767Z	DEBUG	[logstash]	logstash/async.go:99	close connection
2018-04-18T08:58:44.767Z	DEBUG	[logstash]	logstash/async.go:99	close connection
2018-04-18T08:58:44.767Z	ERROR	logstash/async.go:235	Failed to publish events caused by: write tcp 172.31.11.2:33100->172.31.1.52:5044: use of closed network connection
2018-04-18T08:58:45.768Z	ERROR	pipeline/output.go:92	Failed to publish events: write tcp 172.31.11.2:33100->172.31.1.52:5044: use of closed network connection
2018-04-18T08:58:45.768Z	DEBUG	[logstash]	logstash/async.go:94	connect
2018-04-18T08:58:45.777Z	DEBUG	[logstash]	logstash/async.go:142	1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
^C

This is a different error, showing "use of closed network connection". Could you please try to telnet to port 5044? I think Filebeat is not able to connect to port 5044 on the Logstash host.

Thanks,

Hi,

I am able to telnet to my Logstash node on port 5044.

[root@compute1 filebeat]# telnet 172.31.1.52 5044
Trying 172.31.1.52...
Connected to 172.31.1.52.
Escape character is '^]'.

Hi,
Thank you!

Logstash could be resetting the connection due to inactivity, in which case this shouldn't be a problem. You can try increasing the client_inactivity_timeout.

But in the logs, it looks like the failure happens while sending. This could mean there is something blocking the Logstash pipeline.
Does the problem occur if you output only to stdout or to a file from Logstash?
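For example, a minimal pipeline along these lines (just an illustration, reusing your beats input settings) would take Elasticsearch out of the picture and show whether back-pressure from the outputs is blocking things:

input {
  beats {
    port => "5044"
    client_inactivity_timeout => "600"
  }
}

output {
  stdout { codec => rubydebug }
}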

Thanks,
Harsh

Hi Harsh,

I tried outputting only to stdout from Logstash.

I have also increased client_inactivity_timeout to 842, but there was no improvement.

2018-04-18T09:35:00.725Z	DEBUG	[transport]	transport/client.go:201	handle error: write tcp 172.31.11.2:53340->172.31.1.52:5044: use of closed network connection
2018-04-18T09:35:00.725Z	DEBUG	[logstash]	logstash/async.go:142	1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T09:35:00.725Z	DEBUG	[logstash]	logstash/async.go:99	close connection
2018-04-18T09:35:00.725Z	DEBUG	[logstash]	logstash/async.go:99	close connection
2018-04-18T09:35:00.725Z	ERROR	logstash/async.go:235	Failed to publish events caused by: write tcp 172.31.11.2:53340->172.31.1.52:5044: use of closed network connection
2018-04-18T09:35:01.572Z	DEBUG	[prospector]	prospector/prospector.go:124	Run prospector
2018-04-18T09:35:01.573Z	DEBUG	[prospector]	log/prospector.go:147	Start next scan
2018-04-18T09:35:01.573Z	DEBUG	[prospector]	log/prospector.go:361	Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T09:35:01.573Z	DEBUG	[prospector]	log/prospector.go:447	Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/access.log, offset: 610346
2018-04-18T09:35:01.573Z	DEBUG	[prospector]	log/prospector.go:499	Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T09:35:01.573Z	DEBUG	[prospector]	log/prospector.go:361	Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T09:35:01.573Z	DEBUG	[prospector]	log/prospector.go:447	Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log, offset: 722692
2018-04-18T09:35:01.573Z	DEBUG	[prospector]	log/prospector.go:499	Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T09:35:01.573Z	DEBUG	[prospector]	log/prospector.go:168	Prospector states cleaned up. Before: 2, After: 2
2018-04-18T09:35:01.726Z	ERROR	pipeline/output.go:92	Failed to publish events: write tcp 172.31.11.2:53340->172.31.1.52:5044: use of closed network connection
2018-04-18T09:35:01.726Z	DEBUG	[logstash]	logstash/async.go:94	connect
2018-04-18T09:35:01.736Z	DEBUG	[logstash]	logstash/async.go:142	1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T09:35:11.573Z	DEBUG	[prospector]	prospector/prospector.go:124	Run prospector
2018-04-18T09:35:11.573Z	DEBUG	[prospector]	log/prospector.go:147	Start next scan

Hi,

Unfortunately, I'm not able to reproduce the same error on my machine; everything is working fine here.

Could you please comment out your filter plugins and restart the service? This is just a trial: since Logstash is not receiving the data, there may be something blocking the Logstash pipeline. Please try this and let me know if it works.
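For example, using your existing pipeline, the trial config could look roughly like this (only the filter block commented out, everything else unchanged):

filter {
#   if [fields][log_type] == "cb-test-nginx" {
#     grok {
#       match => { "message" => "%{COMBINEDAPACHELOG} %{BASE16FLOAT} %{BASE16FLOAT} ." }
#     }
#   }
#   else if [fields][log_type] == "mesos-log" {
#     grok {
#       match => { "message" => "%{GREEDYDATA}" }
#     }
#   }
}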

Thanks,

Hi Harsh,

Still the same error. I deleted the filter section for that particular log in Logstash and restarted.
I don't know what's wrong. I have Filebeat running on other nodes and it works fine, but not on this node where my nginx is running.

2018-04-18T10:40:09.809Z	DEBUG	[logstash]	logstash/async.go:142	1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T10:40:09.809Z	DEBUG	[logstash]	logstash/async.go:99	close connection
2018-04-18T10:40:09.809Z	DEBUG	[logstash]	logstash/async.go:99	close connection
2018-04-18T10:40:09.809Z	ERROR	logstash/async.go:235	Failed to publish events caused by: write tcp 172.31.11.2:49364->172.31.1.52:5044: use of closed network connection
2018-04-18T10:40:10.809Z	ERROR	pipeline/output.go:92	Failed to publish events: write tcp 172.31.11.2:49364->172.31.1.52:5044: use of closed network connection
2018-04-18T10:40:10.809Z	DEBUG	[logstash]	logstash/async.go:94	connect
2018-04-18T10:40:10.819Z	DEBUG	[logstash]	logstash/async.go:142	1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	prospector/prospector.go:124	Run prospector
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	log/prospector.go:147	Start next scan
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	log/prospector.go:361	Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	log/prospector.go:447	Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/access.log, offset: 593470
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	log/prospector.go:499	Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	log/prospector.go:361	Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	log/prospector.go:447	Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log, offset: 752242
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	log/prospector.go:499	Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T10:40:18.719Z	DEBUG	[prospector]	log/prospector.go:168	Prospector states cleaned up. Before: 2, After: 2

Hi,

Have you restarted both services, i.e. Logstash and Filebeat, or only Logstash?

Hi Harsh,

The issue got resolved after some deep troubleshooting. I am running my Logstash and Elasticsearch cluster as Docker containers on an overlay network, and the nginx node where Filebeat runs and the Logstash node are in different VLANs. I still haven't found exactly where the packets were being dropped, but after running my Logstash container in bridge network mode, it started working.
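For reference, switching the container to Docker's default bridge network and publishing the Beats port looks roughly like this (the image tag and mounted path below are illustrative, not my exact command):

docker run -d --name logstash \
  --network bridge \
  -p 5044:5044 \
  -v /path/to/pipeline/:/usr/share/logstash/pipeline/ \
  docker.elastic.co/logstash/logstash:6.2.3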

Thanks a lot for the help.
