Hello, I have Elasticsearch, Fluent Bit, and Kibana installed in our Kubernetes cluster.
Lately we have been losing a large share of the logs that should flow from the Kubernetes nodes into the Elasticsearch indices and, from there, into Kibana.
I have been noticing errors similar to the ones below from Fluent Bit. They tell me that Fluent Bit is unable to push its chunks to the output, in this case the Elasticsearch cluster.
[2022/06/08 13:13:39] [ warn] [input] tail.0 paused (mem buf overlimit)
[2022/06/08 13:13:39] [ info] [input] tail.0 resume (mem buf overlimit)
[2022/06/08 13:13:39] [ warn] [input] tail.0 paused (mem buf overlimit)
[2022/06/08 13:13:40] [ info] [input] tail.0 resume (mem buf overlimit)
[2022/06/08 13:13:40] [ warn] [input] tail.0 paused (mem buf overlimit)
[2022/06/08 13:13:40] [ info] [input] tail.0 resume (mem buf overlimit)
[2022/06/08 13:13:40] [ warn] [input] tail.0 paused (mem buf overlimit)
[2022/06/08 13:13:40] [ info] [input] tail.0 resume (mem buf overlimit)
[2022/06/08 13:13:40] [ warn] [input] tail.0 paused (mem buf overlimit)
[2022/06/08 13:13:40] [ info] [input] tail.0 resume (mem buf overlimit)
[2022/06/08 13:13:40] [ info] [input] systemd.1 resume (mem buf overlimit)
[2022/06/08 13:13:40] [ warn] [input] tail.0 paused (mem buf overlimit)
[2022/06/08 13:13:40] [ warn] [input] systemd.1 paused (mem buf overlimit)
[2022/06/08 13:13:42] [ info] [input] tail.0 resume (mem buf overlimit)
[2022/06/08 13:13:42] [ warn] [engine] failed to flush chunk '1-1654693979.91366988.flb', retry in 8 seconds: task_id=25, input=tail.0 > output=es.0 (out_id=0)
[2022/06/08 13:13:42] [ warn] [engine] failed to flush chunk '1-1654693940.467286805.flb', retry in 79 seconds: task_id=6, input=tail.0 > output=es.0 (out_id=0)
[2022/06/08 13:13:43] [ warn] [input] tail.0 paused (mem buf overlimit)
[2022/06/08 13:13:43] [ info] [input] tail.0 resume (mem buf overlimit)
[2022/06/08 13:13:43] [ warn] [input] tail.0 paused (mem buf overlimit)
[2022/06/08 13:13:43] [ info] [input] tail.0 resume (mem buf overlimit)
[2022/06/08 13:13:44] [ warn] [input] tail.0 paused (mem buf overlimit)
I have gone through many Fluent Bit options and tried them out (increasing the memory buffer limit, buffering on disk instead of in memory, and so on; a sketch of the kind of configuration I tried is shown after the heap setting below), but nothing has worked. Restarting the Fluent Bit pods helps for only a couple of seconds before the "tail paused" messages return, and nothing ends up in the Elasticsearch indices or, therefore, in Kibana. I also want to add that the Elasticsearch data nodes use more than 75% of their memory (each data node is allotted 4 GB) almost all of the time. The data nodes have the following heap setting:
ES_JAVA_OPTS: -Xms2g -Xmx2g
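For reference, this is roughly the kind of buffering configuration I experimented with. It is only a sketch: the values and the /var/log/flb-storage/ path are examples of what I tried, not necessarily what is deployed right now.

[SERVICE]
    # buffer chunks on the filesystem so backpressure does not pause the inputs
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.backlog.mem_limit 50M

[INPUT]
    Name           tail
    Path           /var/log/containers/*.log
    Mem_Buf_Limit  50M          # raised from the 5M default in one attempt
    storage.type   filesystem   # spill to disk instead of holding chunks in memory

Even with variations of this, nothing improved, which is why I suspect the real bottleneck is on the Elasticsearch side.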
Turning to the Elasticsearch master and data nodes, errors like the one below are logged periodically. What is odd is that a small fraction of the flb chunks (maybe 10%) does make it into the indices, which makes me think this is not actually an SSL certificate issue. It could be something else, but I am unable to make progress in finding the cause of these errors from Elasticsearch. My belief is that because the Elasticsearch nodes cannot process the messages from the various Fluent Bit pods due to this error, Fluent Bit just pauses indefinitely. (A basic connectivity check I could run is sketched after the stack trace below.)
2022-06-08T13:11:39.570067756Z {"type": "server", "timestamp": "2022-06-08T13:11:39,568Z", "level": "WARN", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "elastic-cluster", "node.name": "elastic-cluster-es-master-2", "message": "caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=/10.233.118.18:9200, remoteAddress=/10.233.89.197:35874}", "cluster.uuid": "EQQ472KzSX6QcuQCs3jRuw", "node.id": "mmrOUNzpSMy8RyDP_wLHxg" ,
2022-06-08T13:11:39.570103748Z "stacktrace": ["io.netty.handler.codec.DecoderException: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?",
2022-06-08T13:11:39.570112079Z "at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:473) ~[netty-codec-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570127358Z "at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281) ~[netty-codec-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570133826Z "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570139755Z "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570145286Z "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570150766Z "at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570156493Z "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570162131Z "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570167811Z "at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570175198Z "at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570181085Z "at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570186779Z "at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:600) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570192321Z "at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:554) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570242831Z "at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) [netty-transport-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570264241Z "at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) [netty-common-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570270499Z "at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570276101Z "at java.lang.Thread.run(Thread.java:830) [?:?]",
2022-06-08T13:11:39.570281353Z "Caused by: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?",
2022-06-08T13:11:39.570286653Z "at sun.security.ssl.SSLEngineInputRecord.bytesInCompletePacket(SSLEngineInputRecord.java:146) ~[?:?]",
2022-06-08T13:11:39.570292209Z "at sun.security.ssl.SSLEngineInputRecord.bytesInCompletePacket(SSLEngineInputRecord.java:64) ~[?:?]",
2022-06-08T13:11:39.570297813Z "at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:605) ~[?:?]",
2022-06-08T13:11:39.570302918Z "at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:499) ~[?:?]",
2022-06-08T13:11:39.570312135Z "at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:475) ~[?:?]",
2022-06-08T13:11:39.570317811Z "at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:634) ~[?:?]",
2022-06-08T13:11:39.570323185Z "at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:280) ~[netty-handler-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570328785Z "at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1332) ~[netty-handler-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570334831Z "at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1227) ~[netty-handler-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570340515Z "at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1274) ~[netty-handler-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570354715Z "at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:503) ~[netty-codec-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570361873Z "at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442) ~[netty-codec-4.1.43.Final.jar:4.1.43.Final]",
2022-06-08T13:11:39.570367756Z "... 16 more"] }
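The "plaintext connection?" part of the exception suggests that something is reaching port 9200 over plain HTTP while the node expects TLS. A basic check I could run from a pod inside the cluster would be to compare a plain HTTP request with an HTTPS one; the service name elastic-cluster-es-http and the <user>:<password> placeholder below are just examples for my setup, not confirmed values:

# should be rejected if the node only speaks TLS on 9200
curl -v http://elastic-cluster-es-http:9200/

# should return the cluster banner if TLS and the credentials are fine
curl -vk -u '<user>:<password>' https://elastic-cluster-es-http:9200/

If the HTTPS request succeeds, the certificate itself is not the problem, which would match the fact that some chunks do get indexed.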
This is the output section of the Fluent Bit configuration:
[OUTPUT]
    Name            es
    Match           *
    Host            ${FLUENT_ELASTICSEARCH_HOST}
    Port            ${FLUENT_ELASTICSEARCH_PORT}
    Logstash_Format On
    Replace_Dots    On
    Retry_Limit     False
    tls             On
    tls.verify      Off
    HTTP_User       <>
    HTTP_Passwd     <>
    Trace_Error     On
    net.keepalive   Off
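To get more visibility into what the es output is actually doing, one thing I could try (a sketch only; port 2020 and the /api/v1 paths are the Fluent Bit defaults, and the [SERVICE] keys would be merged into the existing service section) is to enable the built-in HTTP monitoring endpoint and watch the per-output retry and error counters:

[SERVICE]
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020

# then, from inside a Fluent Bit pod:
curl -s http://127.0.0.1:2020/api/v1/metrics
curl -s http://127.0.0.1:2020/api/v1/storage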
Can someone help me understand what is going wrong in my case? This was working well until about three months ago; there have been no upgrades on our end, and it slowly deteriorated to its current state.