Filebeat not connecting directly to Elasticsearch from particular machine

Michael1 · June 16, 2016, 7:57pm

Ok, updates ::

I have changed the following properties to the following values ::

idle_timeout: 10s
max_retries: 5
bulk_max_size: 1
flush_interval: 10

After setting those that way, Filebeat was able to transmit logs all day without any timeouts! Yes!!

However, just a few minutes ago, here came the first new problem, log ::

2016-06-16T18:43:21Z INFO Registry file updated. 1532 states written.
2016-06-16T19:00:37Z ERR Failed to perform any bulk index operations: Post http://10.0.1.20:9200/_bulk: read tcp 10.183.220.10:55450->10.0.1.20:9200: wsarecv: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
2016-06-16T19:00:37Z INFO Error publishing events (retrying): Post http://10.0.1.20:9200/_bulk: read tcp 10.183.220.10:55450->10.0.1.20:9200: wsarecv: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
2016-06-16T19:00:37Z INFO send fail
2016-06-16T19:00:37Z INFO backoff retry: 1s
2016-06-16T19:00:38Z INFO Events sent: 1
2016-06-16T19:00:38Z INFO Registry file updated. 1532 states written.
2016-06-16T19:10:57Z INFO Harvester started for file: F:\Logs\ULS\STSP-WFE01-20160616-1910.log
2016-06-16T19:10:57Z INFO Registry file updated. 1533 states written.
2016-06-16T19:17:32Z INFO Read line error: file inactive
2016-06-16T19:22:17Z ERR Failed to perform any bulk index operations: Post http://10.0.1.20:9200/_bulk: read tcp 10.183.220.10:57623->10.0.1.20:9200: wsarecv: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
2016-06-16T19:22:17Z INFO Error publishing events (retrying): Post http://10.0.1.20:9200/_bulk: read tcp 10.183.220.10:57623->10.0.1.20:9200: wsarecv: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
2016-06-16T19:22:17Z INFO send fail
2016-06-16T19:22:17Z INFO backoff retry: 1s
2016-06-16T19:22:19Z INFO Events sent: 2
2016-06-16T19:22:19Z INFO Registry file updated. 1533 states written.
2016-06-16T19:40:24Z INFO Read line error: file inactive
2016-06-16T19:41:03Z INFO Harvester started for file: F:\Logs\ULS\STSP-WFE01-20160616-1940.log
2016-06-16T19:41:03Z INFO Registry file updated. 1534 states written.
2016-06-16T19:54:47Z ERR Failed to perform any bulk index operations: Post http://10.0.1.20:9200/_bulk: read tcp 10.183.220.10:57754->10.0.1.20:9200: wsarecv: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
2016-06-16T19:54:47Z INFO Error publishing events (retrying): Post http://10.0.1.20:9200/_bulk: read tcp 10.183.220.10:57754->10.0.1.20:9200: wsarecv: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
2016-06-16T19:54:47Z INFO send fail
2016-06-16T19:54:47Z INFO backoff retry: 1s
2016-06-16T19:54:50Z INFO Events sent: 2
2016-06-16T19:54:50Z INFO Registry file updated. 1534 states written.
2016-06-16T19:56:00Z INFO Events sent: 1
2016-06-16T19:56:00Z INFO Registry file updated. 1534 states written.

Now it is just repeating that. This is on the IaaS web server (SharePoint). The PaaS (cloud service) is still doing fine. Now when I try to access the index coming from that web server, Kibana is throwing a nasty red error at the top saying "index not found".

I tried a few minutes later and I was able to access the index in Kibana again for that web server. It seems to have gone through a period of 'hiccups' for some reason... Could it be related to those "Read line error: file inactive" messages?
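
One way to check whether the failures in the log above follow quiet periods is to measure the time between each ERR line and the last "Events sent" line before it. The Python sketch below does that. The function name `idle_gaps_before_errors` is made up for illustration, and the parsing assumes only the timestamp and message formats visible in this excerpt. Long gaps would be consistent with something between Filebeat and Elasticsearch dropping idle connections, but that is a hypothesis to test, not a diagnosis.

```python
from datetime import datetime

# Timestamp layout as it appears in the Filebeat log excerpt above.
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def idle_gaps_before_errors(lines):
    """Return a list of (error_timestamp, seconds_since_last_send).

    seconds_since_last_send is None when no "Events sent" line
    precedes the error in the given lines.
    """
    last_send = None
    gaps = []
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 3:
            continue  # not a "<timestamp> <level> <message>" line
        try:
            ts = datetime.strptime(parts[0], TS_FORMAT)
        except ValueError:
            continue  # first token is not a timestamp
        level, message = parts[1], parts[2]
        if level == "ERR" and message.startswith(
            "Failed to perform any bulk index operations"
        ):
            gap = None if last_send is None else (ts - last_send).total_seconds()
            gaps.append((parts[0], gap))
        elif message.startswith("Events sent:"):
            last_send = ts  # remember the most recent successful publish
    return gaps
```

Applied to the excerpt above, the second and third errors come roughly 21 and 32 minutes after the previous successful send. Note that the ERR timestamp marks when the read gave up, so the gap slightly overstates the real idle time.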

ruflin · June 17, 2016, 10:11am

Based on all the descriptions and behaviour above, I'm quite confident the issue is not directly related to Filebeat or Kibana, but to the network connectivity of both to Elasticsearch.

As far as I understand, all your requests go through the ILB? Did you check the logs of the ILB to see why some requests fail?

Michael1 · June 20, 2016, 3:25pm

This is an ILB configured through PowerShell on Microsoft Azure. Yes, all requests do go through the ILB, which contains a backend pool of ES data VMs.

Cannot find any logging on that ILB sadly....

ruflin · June 21, 2016, 11:09am

Not sure how we should move forward here as I think in this case the ILB logs could be very helpful.
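
Without ILB logs, one client-side way to gather evidence is to hold an HTTP connection open through the ILB, leave it idle for a chosen time, and then try to reuse it. The Python sketch below uses only the standard library. The function name `probe_after_idle` and its parameters are hypothetical, and it assumes the endpoint keeps HTTP connections alive, which is why it bails out if the server closes the connection after the first response.

```python
import http.client
import time


def probe_after_idle(host, port, idle_seconds, path="/",
                     conn_factory=http.client.HTTPConnection,
                     sleep=time.sleep):
    """Return True if a kept-alive connection still works after idling.

    Returns False when the second request fails with a connection-level
    error. Errors on the first request are raised to the caller.
    """
    conn = conn_factory(host, port, timeout=30)
    try:
        conn.request("GET", path)
        conn.getresponse().read()  # drain the body so the socket can be reused
        if conn.sock is None:
            # http.client would silently reconnect on the next request,
            # which would hide a dropped connection.
            raise RuntimeError("server closed the connection; cannot test reuse")
        sleep(idle_seconds)
        try:
            conn.request("GET", path)
            conn.getresponse().read()
            return True
        except (OSError, http.client.HTTPException):
            return False
    finally:
        conn.close()
```

Calling, for example, `probe_after_idle("10.0.1.20", 9200, 600)` from the affected web server with increasing idle times, and again from a machine that bypasses the ILB, would show whether reuse starts failing beyond some idle duration only on the ILB path.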

system · June 27, 2016, 8:15pm

This topic was automatically closed after 21 days. New replies are no longer allowed.
