Hi,
When I stated my filebeat in debug mode I am seeing below logs -
2018-04-17T10:42:22.876Z DEBUG [prospector] log/prospector.go:499 Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-17T10:42:22.876Z DEBUG [prospector] log/prospector.go:168 Prospector states cleaned up. Before: 2, After: 2
2018-04-17T10:42:28.025Z DEBUG [transport] transport/client.go:201 handle error: write tcp 172.31.11.2:33918->172.31.1.52:5044: i/o timeout
2018-04-17T10:42:28.025Z DEBUG [logstash] logstash/async.go:142 2048 events out of 2048 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-17T10:42:28.025Z DEBUG [logstash] logstash/async.go:99 close connection
2018-04-17T10:42:28.025Z DEBUG [transport] transport/client.go:114 closing
2018-04-17T10:42:28.025Z ERROR logstash/async.go:235 Failed to publish events caused by: write tcp 172.31.11.2:33918->172.31.1.52:5044: i/o timeout
2018-04-17T10:42:28.025Z DEBUG [logstash] logstash/async.go:99 close connection
2018-04-17T10:42:29.026Z ERROR pipeline/output.go:92 Failed to publish events: write tcp 172.31.11.2:33918->172.31.1.52:5044: i/o timeout
2018-04-17T10:42:29.026Z DEBUG [logstash] logstash/async.go:94 connect
2018-04-17T10:42:32.876Z DEBUG [prospector] prospector/prospector.go:124 Run prospector
2018-04-17T10:42:32.876Z DEBUG [prospector] prospector/prospector.go:124 Run prospector
2018-04-17T10:42:32.876Z DEBUG [prospector] log/prospector.go:147 Start next scan
2018-04-17T10:42:32.876Z DEBUG [prospector] log/prospector.go:361 Check file for harvesting: /var/log/mesos/mesos-agent.log
2018-04-17T10:42:32.876Z DEBUG [prospector] log/prospector.go:447 Update existing file for harvesting: /var/log/mesos/mesos-agent.log, offset: 338883
2018-04-17T10:42:32.876Z DEBUG [prospector] log/prospector.go:499 Harvester for file is still running: /var/log/mesos/mesos-agent.log
2018-04-17T10:42:32.876Z DEBUG [prospector] log/prospector.go:168 Prospector states cleaned up. Before: 1, After: 1
2018-04-17T10:42:32.876Z DEBUG [prospector] log/prospector.go:147 Start next scan
filebeat.yml
---
filebeat.prospectors:
- type: log
paths:
- /dcos/volume1/carbook-test-nginx/logs/access*
- /dcos/volume1/carbook-test-nginx/logs/error*
fields:
log_type: cb-test-nginx
- type: log
paths:
- /var/log/mesos/*.log
fields:
log_type: mesos-log
exclude_files: ["stdout.logrotate.state", "stdout.logrotate.conf", "stderr.logrotate.state", "stderr.logrotate.conf"]
output.logstash:
hosts:
- "172.31.1.52:5044"
bulk_max_size: 4096
logging.level: debug
logging.to_files: true
logging.files:
path: /var/log/filebeat
name: filebeat
keepfiles: 7
permissions: 0644
logstash.conf file under pipeline
input {
beats {
port => "5044"
client_inactivity_timeout => "600"
}
}
filter {
if [fields][log_type] == "cb-test-nginx" {
grok {
match => { "message" => "%{COMBINEDAPACHELOG} %{BASE16FLOAT} %{BASE16FLOAT} ."}
}
}
else if [fields][log_type] == "mesos-log" {
grok {
match => { "message" => "%{GREEDYDATA}"}
}
}
}
output {
if [fields][log_type] == "cb-test-nginx" {
elasticsearch {
hosts => ["172.31.1.37:9201","172.31.1.46:9201","172.31.1.47:9201"]
index => "cb-test-nginx-%{+YYYY.MM.dd}"
}
}
else if [fields][log_type] == "mesos-log" {
elasticsearch {
hosts => ["172.31.1.37:9201","172.31.1.46:9201","172.31.1.47:9201"]
index => "mesos-log-%{+YYYY.MM.dd}"
}
}
}
Hi,
The timeout occurs when waiting for the ACK signal from logstash. The default timeout is 60 seconds. If logstash is actively processing a batch of events, it sends a ACK signal every 5 seconds.
Reasons why this signal is not received by filebeat can be either network or contention in logstash (induced by additional back-pressure on outputs).
pls try with this if it is helpful for you.
Thanks,
Harsh Bajaj
Hi, Thanks,
I tried with bulk_max_size: 1024 in filebeat.yml but I am geeting similar error.
2018-04-18T08:58:39.634Z DEBUG [prospector] log/prospector.go:499 Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T08:58:39.634Z DEBUG [prospector] log/prospector.go:361 Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T08:58:39.634Z DEBUG [prospector] log/prospector.go:447 Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log, offset: 665238
2018-04-18T08:58:39.634Z DEBUG [prospector] log/prospector.go:499 Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T08:58:39.634Z DEBUG [prospector] log/prospector.go:168 Prospector states cleaned up. Before: 2, After: 2
2018-04-18T08:58:44.767Z DEBUG [transport] transport/client.go:201 handle error: read tcp 172.31.11.2:33100->172.31.1.52:5044: i/o timeout
2018-04-18T08:58:44.767Z ERROR logstash/async.go:235 Failed to publish events caused by: read tcp 172.31.11.2:33100->172.31.1.52:5044: i/o timeout
2018-04-18T08:58:44.767Z DEBUG [transport] transport/client.go:114 closing
2018-04-18T08:58:44.767Z DEBUG [transport] transport/client.go:201 handle error: write tcp 172.31.11.2:33100->172.31.1.52:5044: use of closed network connection
2018-04-18T08:58:44.767Z DEBUG [logstash] logstash/async.go:142 1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T08:58:44.767Z DEBUG [logstash] logstash/async.go:99 close connection
2018-04-18T08:58:44.767Z DEBUG [logstash] logstash/async.go:99 close connection
2018-04-18T08:58:44.767Z ERROR logstash/async.go:235 Failed to publish events caused by: write tcp 172.31.11.2:33100->172.31.1.52:5044: use of closed network connection
2018-04-18T08:58:45.768Z ERROR pipeline/output.go:92 Failed to publish events: write tcp 172.31.11.2:33100->172.31.1.52:5044: use of closed network connection
2018-04-18T08:58:45.768Z DEBUG [logstash] logstash/async.go:94 connect
2018-04-18T08:58:45.777Z DEBUG [logstash] logstash/async.go:142 1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
^C
This is the different error showing "closed network connection". Could you please telnet on port 5044. i think filebeat is not able to connect with 5044 port.
Thanks,
Hi,
I am able to do telnet to my logstash node on 5044.
[root@compute1 filebeat]# telnet 172.31.1.52 5044
Trying 172.31.1.52...
Connected to 172.31.1.52.
Escape character is '^]'.
Hi,
Thank you!
Logstash could be resetting the connection due to inactivity, in which case this shouldn't be a problem. You can try increasing the client_inactivity_timeout 842.
But in logs, it looks like it's happening when sending. This could mean there is something blocking the Logstash pipeline.
Does the problem occur if you output only to stdout or to a file from Logstash?
Thanks,
Harsh
Hi Harsh,
I am trying to do stdout only to logstash.
I have increased client_inactivity_timeout to 842 but no gain.
2018-04-18T09:35:00.725Z DEBUG [transport] transport/client.go:201 handle error: write tcp 172.31.11.2:53340->172.31.1.52:5044: use of closed network connection
2018-04-18T09:35:00.725Z DEBUG [logstash] logstash/async.go:142 1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T09:35:00.725Z DEBUG [logstash] logstash/async.go:99 close connection
2018-04-18T09:35:00.725Z DEBUG [logstash] logstash/async.go:99 close connection
2018-04-18T09:35:00.725Z ERROR logstash/async.go:235 Failed to publish events caused by: write tcp 172.31.11.2:53340->172.31.1.52:5044: use of closed network connection
2018-04-18T09:35:01.572Z DEBUG [prospector] prospector/prospector.go:124 Run prospector
2018-04-18T09:35:01.573Z DEBUG [prospector] log/prospector.go:147 Start next scan
2018-04-18T09:35:01.573Z DEBUG [prospector] log/prospector.go:361 Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T09:35:01.573Z DEBUG [prospector] log/prospector.go:447 Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/access.log, offset: 610346
2018-04-18T09:35:01.573Z DEBUG [prospector] log/prospector.go:499 Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T09:35:01.573Z DEBUG [prospector] log/prospector.go:361 Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T09:35:01.573Z DEBUG [prospector] log/prospector.go:447 Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log, offset: 722692
2018-04-18T09:35:01.573Z DEBUG [prospector] log/prospector.go:499 Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T09:35:01.573Z DEBUG [prospector] log/prospector.go:168 Prospector states cleaned up. Before: 2, After: 2
2018-04-18T09:35:01.726Z ERROR pipeline/output.go:92 Failed to publish events: write tcp 172.31.11.2:53340->172.31.1.52:5044: use of closed network connection
2018-04-18T09:35:01.726Z DEBUG [logstash] logstash/async.go:94 connect
2018-04-18T09:35:01.736Z DEBUG [logstash] logstash/async.go:142 1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T09:35:11.573Z DEBUG [prospector] prospector/prospector.go:124 Run prospector
2018-04-18T09:35:11.573Z DEBUG [prospector] log/prospector.go:147 Start next scan
Hi,
Unfortunately i'm not able to reproduce the same error on my machine each and everything is working fine.
could you please comment your filter plugins and restart your service this is for trail because logstash is not receiving the data there is some blocking in logstash pipeline so please try with this and let me know if it works.
Thanks,
Hi Harsh,
Still the same error, I deleted the filter section for that particular log in logstash and restated.
Don't know what's wrong. I have filebeat runner on other nodes and that is working fine but not on this node where my nginx is running.
018-04-18T10:40:09.809Z DEBUG [logstash] logstash/async.go:142 1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T10:40:09.809Z DEBUG [logstash] logstash/async.go:99 close connection
2018-04-18T10:40:09.809Z DEBUG [logstash] logstash/async.go:99 close connection
2018-04-18T10:40:09.809Z ERROR logstash/async.go:235 Failed to publish events caused by: write tcp 172.31.11.2:49364->172.31.1.52:5044: use of closed network connection
2018-04-18T10:40:10.809Z ERROR pipeline/output.go:92 Failed to publish events: write tcp 172.31.11.2:49364->172.31.1.52:5044: use of closed network connection
2018-04-18T10:40:10.809Z DEBUG [logstash] logstash/async.go:94 connect
2018-04-18T10:40:10.819Z DEBUG [logstash] logstash/async.go:142 1024 events out of 1024 events sent to logstash host 172.31.1.52:5044. Continue sending
2018-04-18T10:40:18.719Z DEBUG [prospector] prospector/prospector.go:124 Run prospector
2018-04-18T10:40:18.719Z DEBUG [prospector] log/prospector.go:147 Start next scan
2018-04-18T10:40:18.719Z DEBUG [prospector] log/prospector.go:361 Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T10:40:18.719Z DEBUG [prospector] log/prospector.go:447 Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/access.log, offset: 593470
2018-04-18T10:40:18.719Z DEBUG [prospector] log/prospector.go:499 Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/access.log
2018-04-18T10:40:18.719Z DEBUG [prospector] log/prospector.go:361 Check file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T10:40:18.719Z DEBUG [prospector] log/prospector.go:447 Update existing file for harvesting: /dcos/volume1/carbook-test-nginx/logs/error.log, offset: 752242
2018-04-18T10:40:18.719Z DEBUG [prospector] log/prospector.go:499 Harvester for file is still running: /dcos/volume1/carbook-test-nginx/logs/error.log
2018-04-18T10:40:18.719Z DEBUG [prospector] log/prospector.go:168 Prospector states cleaned up. Before: 2, After: 2
Hi,
Have you restarted both services i.e. Logstash and filebeat or only logstash?
Hi Harsh,
Issue got resolved after deep troubleshooting. Actually I am running my logstash and elastic cluster as docker in overlay network,Also the nginx node where the filebeat is running and the logstash are in different vlans. Still I didn't find where the packet is getting dropped. But after running my logstash container in bridge mode, it started working.
Thanks a lot for the help.