I have a setup with several dozen Filebeat instances (v6.2.2) connected to Elasticsearch (also v6.2.2).
Each Filebeat monitors several dozen log files.
Some of the Beats occasionally stop working properly; I can see the following in the Filebeat log:
|2018-10-03T12:41:23.282+0100|INFO|elasticsearch/client.go:690|Connected to Elasticsearch version 6.2.2|
|---|---|---|---|
|2018-10-03T12:41:23.282+0100|INFO|template/load.go:73|Template already exists and will not be overwritten.|
|2018-10-03T12:41:24.086+0100|ERROR|elasticsearch/client.go:299|Failed to perform any bulk index operations: Post http://lonasdpricu2:9200/_bulk: net/http: HTTP/1.x transport connection broken: write tcp xx.xx.xx.xx:64229->xx.xx.xx.xx:9200: wsasend: An existing connection was forcibly closed by the remote host.|
|2018-10-03T12:41:25.124+0100|ERROR|pipeline/output.go:92|Failed to publish events: Post http://lonasdpricu2:9200/_bulk: net/http: HTTP/1.x transport connection broken: write tcp xx.xx.xx.xx:64229->xx.xx.xx.xx:9200: wsasend: An existing connection was forcibly closed by the remote host.|
|2018-10-03T12:41:25.124+0100|INFO|elasticsearch/client.go:690|Connected to Elasticsearch version 6.2.2|
|2018-10-03T12:41:25.124+0100|INFO|template/load.go:73|Template already exists and will not be overwritten.|
|2018-10-03T12:41:25.947+0100|ERROR|elasticsearch/client.go:299|Failed to perform any bulk index operations: Post http://lonasdpricu2:9200/_bulk: net/http: HTTP/1.x transport connection broken: write tcp xx.xx.xx.xx:64231->xx.xx.xx.xx:9200: wsasend: An existing connection was forcibly closed by the remote host.|
|2018-10-03T12:41:26.987+0100|ERROR|pipeline/output.go:92|Failed to publish events: Post http://lonasdpricu2:9200/_bulk: net/http: HTTP/1.x transport connection broken: write tcp xx.xx.xx.xx:64231->xx.xx.xx.xx:9200: wsasend: An existing connection was forcibly closed by the remote host.|
There are no errors in the Elasticsearch log at this time.
The Filebeat only starts processing logs again after a restart.
Could you please help me with this issue? It is rather annoying to discover that some data is missing, then work out which Filebeat is stuck, and then restart it...
Which operating system are you using? I'd suggest updating Filebeat to 6.4 anyway, just in case. We have had some known issues on Windows, as you can see here: https://github.com/elastic/beats/issues/905
Hi @Mario_Castro,
I'm using Windows Server 2008 R2 for the Filebeat hosts and Windows Server 2012 for Elasticsearch.
I have upgraded everything to 6.4.2 and will keep an eye on the Filebeats.
Hi @Mario_Castro,
The same problem occurs with version 6.4.2.
In the Filebeat logs:
|2018-10-05T15:59:17.617+0100|INFO|pipeline/output.go:95|Connecting to backoff(elasticsearch(http://lonasdpricu2:9200))|
|2018-10-05T15:59:17.617+0100|INFO|elasticsearch/client.go:712|Connected to Elasticsearch version 6.4.2|
|2018-10-05T15:59:17.617+0100|INFO|template/load.go:129|Template already exists and will not be overwritten.|
|2018-10-05T15:59:17.618+0100|INFO|pipeline/output.go:105|Connection to backoff(elasticsearch(http://lonasdpricu2:9200)) established|
|2018-10-05T15:59:18.603+0100|ERROR|elasticsearch/client.go:317|Failed to perform any bulk index operations: Post http://lonasdpricu2:9200/_bulk: net/http: HTTP/1.x transport connection broken: write tcp xx.xx.xx.xx:56512->xx.xx.xx.xx:9200: wsasend: An existing connection was forcibly closed by the remote host.|
|2018-10-05T15:59:19.605+0100|ERROR|pipeline/output.go:121|Failed to publish events: Post http://lonasdpricu2:9200/_bulk: net/http: HTTP/1.x transport connection broken: write tcp xx.xx.xx.xx:56512->xx.xx.xx.xx:9200: wsasend: An existing connection was forcibly closed by the remote host.|
|2018-10-05T15:59:19.605+0100|INFO|pipeline/output.go:95|Connecting to backoff(elasticsearch(http://lonasdpricu2:9200))|
|2018-10-05T15:59:19.605+0100|INFO|elasticsearch/client.go:712|Connected to Elasticsearch version 6.4.2|
|2018-10-05T15:59:19.605+0100|INFO|template/load.go:129|Template already exists and will not be overwritten.|
|2018-10-05T15:59:19.621+0100|INFO|pipeline/output.go:105|Connection to backoff(elasticsearch(http://lonasdpricu2:9200)) established|
|2018-10-05T15:59:20.454+0100|ERROR|elasticsearch/client.go:317|Failed to perform any bulk index operations: Post http://lonasdpricu2:9200/_bulk: net/http: HTTP/1.x transport connection broken: write tcp xx.xx.xx.xx:56513->xx.xx.xx.xx:9200: wsasend: An existing connection was forcibly closed by the remote host.|
The situation has got even worse: I cannot recover some of the Filebeats even by restarting them.
And there are still no errors on the Elasticsearch side.
Could there be some restriction on the number of connections to Elasticsearch?
In the meantime, check the permissions of the files, especially the registry file that Filebeat uses. I don't think that's your problem, but it's worth checking.
Please confirm that you aren't using Logstash in the middle, just to double-check that we shouldn't be looking there for issues.
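For reference, these are the bits I'd double-check in filebeat.yml (a minimal sketch; the registry path shown is the Windows-service default and may differ in your installation):

```yaml
# filebeat.yml (sketch -- defaults shown, adjust to your install)
filebeat.registry_file: registry        # resolved relative to path.data;
                                        # C:\ProgramData\filebeat\registry for a Windows
                                        # service install -- verify file permissions there

output.elasticsearch:                   # events should go straight to Elasticsearch;
  hosts: ["http://lonasdpricu2:9200"]   # there should be no output.logstash section at all
```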
Hi @Mario_Castro,
The issue you mentioned doesn't look similar to mine, and I don't use Logstash in the middle.
I have enabled trace logging on Elasticsearch and found the following:
[2018-10-05T17:39:41,385][DEBUG][r.suppressed ] path: /bad-request, params: {}
java.lang.IllegalArgumentException: text is empty (possibly HTTP/0.9)
at io.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:63) ~[netty-codec-http-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:87) ~[netty-codec-http-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.http.HttpObjectDecoder.decode(HttpObjectDecoder.java:219) ~[netty-codec-http-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) [netty-handler-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.16.Final.jar:4.1.16.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
[2018-10-05T17:39:41,385][TRACE][o.e.i.b.request ] [request] Adjusted breaker by [16440] bytes, now [16440]
[2018-10-05T17:39:41,385][TRACE][o.e.i.b.request ] [request] Adjusted breaker by [-16440] bytes, now [0]
[2018-10-05T17:39:41,385][TRACE][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [lonasdpricu2] channel closed: [id: 0x2add648b, L:/xx.xx.xx.xx:9200 ! R:/xx.xx.xx.xx:34900]
I have found that this happens when a Filebeat instance reads several dozen files at startup, with a potentially large amount of data to be pushed to Elasticsearch.
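One thing I may try is throttling that startup burst with harvester_limit (a sketch only; the path and value are examples, not my real config):

```yaml
# filebeat.yml (sketch -- example value only)
filebeat.inputs:
  - type: log
    paths:
      - 'D:\logs\*.json'   # hypothetical path
    harvester_limit: 4     # start at most 4 harvesters for this input at a time
                           # (default 0 = unlimited) to smooth out the startup burst
```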
Uhmmm, "text is empty" message really confuses me but according to your description, filebeat backpressure system should handle potentially large files.
@Mario_Castro, I suspect the problem is the size of the messages. I'm handling files with JSON records that may be up to 100 MB each, and this data is usually produced in bursts.
I have tried increasing the max_bytes parameter, but I cannot predict the exact number of monitored files (15-20 or even more), and the value would have to be rather large.
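For reference, this is roughly what I'm experimenting with (a sketch with illustrative values, not a recommendation). My working assumption is that max_bytes caps a single event on the Filebeat side, while Elasticsearch's http.max_content_length (100mb by default) caps the whole _bulk request, so a batch of several large JSON events might exceed the latter even when each event fits, which could explain the dropped connections:

```yaml
# filebeat.yml (sketch -- illustrative values only)
filebeat.inputs:
  - type: log
    paths:
      - 'D:\logs\*.json'      # hypothetical path
    max_bytes: 104857600      # 100 MiB per event (the default is 10 MiB)

output.elasticsearch:
  hosts: ["http://lonasdpricu2:9200"]
  bulk_max_size: 5            # fewer events per _bulk request so the body is more likely
                              # to stay under Elasticsearch's http.max_content_length
```

Alternatively, http.max_content_length could be raised above its 100mb default in elasticsearch.yml, although a single 100 MB JSON event is already right at that limit, so I'm not sure either knob alone is enough.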