Netty4TcpChannel Negative longs unsupported - repeated in logs

Hello, after updating the cluster to 7.6.0, I have these messages repeated in plain.log:

[2020-02-13T15:56:52,878][WARN ][o.e.t.OutboundHandler    ] [ela3] send message failed [channel: Netty4TcpChannel{localAddress=/A.B.C.95:9300, remoteAddress=/A.B.C.93:56542}]
java.lang.IllegalStateException: Negative longs unsupported, use writeLong or writeZLong for negative numbers [-84034]
        at org.elasticsearch.common.io.stream.StreamOutput.writeVLong(StreamOutput.java:299) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.ingest.IngestStats$Stats.writeTo(IngestStats.java:197) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.ingest.IngestStats.writeTo(IngestStats.java:103) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.common.io.stream.StreamOutput.writeOptionalWriteable(StreamOutput.java:897) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.action.admin.cluster.node.stats.NodeStats.writeTo(NodeStats.java:255) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.OutboundMessage.writeMessage(OutboundMessage.java:87) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.OutboundMessage.serialize(OutboundMessage.java:64) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.OutboundHandler$MessageSerializer.get(OutboundHandler.java:166) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.OutboundHandler$MessageSerializer.get(OutboundHandler.java:152) ~[elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.OutboundHandler$SendContext.get(OutboundHandler.java:199) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.OutboundHandler.internalSend(OutboundHandler.java:129) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.OutboundHandler.sendMessage(OutboundHandler.java:124) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.OutboundHandler.sendResponse(OutboundHandler.java:104) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.TcpTransportChannel.sendResponse(TcpTransportChannel.java:64) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:244) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:240) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:257) [x-pack-security-7.6.0.jar:7.6.0]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:315) [x-pack-security-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:264) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.0.jar:7.6.0]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.0.jar:7.6.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:830) [?:?]
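For context on the exception: a VLong is a variable-length encoding that can only represent non-negative values, so a negative counter (here -84034) aborts serialization of the stats response. The sketch below illustrates the encoding, the rejection check, and the ZigZag alternative the message points to; it mirrors the shape of `StreamOutput.writeVLong` but is not the actual Elasticsearch code.

```java
import java.io.ByteArrayOutputStream;

public class VLongSketch {
    // Encode a non-negative long as a varint: 7 data bits per byte, high bit
    // set on every byte except the last ("more bytes follow"). A negative
    // long, seen as two's complement, would always take the maximum ten
    // bytes, so the writer rejects it up front -- the check that produces
    // the IllegalStateException in the log above.
    static byte[] writeVLong(long value) {
        if (value < 0) {
            throw new IllegalStateException(
                "Negative longs unsupported, use writeLong or writeZLong for negative numbers [" + value + "]");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7FL) | 0x80L));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // ZigZag encoding (the writeZLong-style alternative the message suggests)
    // folds signed longs onto non-negative ones so they can be varint-encoded:
    // 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    static long zigZag(long value) {
        return (value << 1) ^ (value >> 63);
    }

    public static void main(String[] args) {
        System.out.println(writeVLong(300).length); // prints 2 (two bytes)
        System.out.println(zigZag(-84034L));        // prints 168067
        writeVLong(-84034L);                        // throws IllegalStateException
    }
}
```

So whatever produced the stats here handed a negative count to a writer that, by design, only accepts non-negative values.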

Any idea what is wrong?

It is a three-node cluster with a basic configuration. Both non-elected master nodes report this in their logs; the failing sends go from them to the elected master.

....

[2020-02-13T16:04:22,617][WARN ][o.e.t.OutboundHandler    ] [ela3] send message failed [channel: Netty4TcpChannel{localAddress=/A.B.C.95:9300, remoteAddress=/A.B.C.93:56548}]
[2020-02-13T16:04:22,895][WARN ][o.e.t.OutboundHandler    ] [ela3] send message failed [channel: Netty4TcpChannel{localAddress=/A.B.C.95:9300, remoteAddress=/A.B.C.93:56548}]
[2020-02-13T16:04:24,563][WARN ][o.e.t.OutboundHandler    ] [ela3] send message failed [channel: Netty4TcpChannel{localAddress=/A.B.C.95:9300, remoteAddress=/A.B.C.93:56588}]
[2020-02-13T16:04:24,811][WARN ][o.e.t.OutboundHandler    ] [ela3] send message failed [channel: Netty4TcpChannel{localAddress=/A.B.C.95:9300, remoteAddress=/A.B.C.93:56538}]
[2020-02-13T16:04:32,580][WARN ][o.e.t.OutboundHandler    ] [ela3] send message failed [channel: Netty4TcpChannel{localAddress=/A.B.C.95:9300, remoteAddress=/A.B.C.93:56548}]
[2020-02-13T16:04:32,822][WARN ][o.e.t.OutboundHandler    ] [ela3] send message failed [channel: Netty4TcpChannel{localAddress=/A.B.C.95:9300, remoteAddress=/A.B.C.93:56588}]

...

Hi @Fangy

could you share the ingest pipelines you're using? This looks like it may be a bug in the ingest metrics. It would be great to have the concrete pipeline to try to reproduce the issue.

Thanks!

Hi @Armin_Braun
If you mean GET /_ingest/pipeline, I did not touch them myself; they were all created by Beats. I am using Logstash for parsing. Anyway, here you are: pipeline

@Fangy

that link leads to a 404, I'm afraid, but it's good to know that you're using the standard Beats/Logstash pipelines. Are you seeing any ingest-related errors/warnings other than this one in the logs?

@Armin_Braun
Yes, I can confirm this, but I hadn't noticed until now...
From one of the nodes with errors:

cat elk.log | grep 2020-02-13 | grep -c "send mess"
3130
cat elk.log | grep 2020-02-13 | grep -v -c "send mess"
27158

a lot of something :smiley:

cat elk.log | grep 2020-02-13 | grep -c "failed to execute pipeline"
3242

[2020-02-13T17:53:31,229][DEBUG][o.e.a.b.T.BulkRequestModifier] [ela1] failed to execute pipeline [_none] for document [filebeat-7.6.0/_doc/null]
This message makes sense to me: the Cisco ASA time format in the syslog message differs from the one the ingest pipeline expects.

cat elk.log | grep 2020-02-13 | grep -c "failed to execute on node"
20531

[2020-02-13T16:18:42,582][DEBUG][o.e.a.a.c.s.TransportClusterStatsAction] [ela1] failed to execute on node [I38_0BvCR2mxMxsin3r2Ug]
[2020-02-13T16:18:42,601][DEBUG][o.e.a.a.c.s.TransportClusterStatsAction] [ela1] failed to execute on node [OGKWdMK0SmOaTaelB1a86w]
[2020-02-13T16:18:42,837][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [ela1] failed to execute on node [OGKWdMK0SmOaTaelB1a86w]
[2020-02-13T16:18:42,837][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [ela1] failed to execute on node [I38_0BvCR2mxMxsin3r2Ug]

cat elk.log | grep 2020-02-13 | grep -c "failure parsing pipeline config"
2233

[2020-02-13T18:02:03,005][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.5.0-cisco-asa-asa-ftd-pipeline]
[2020-02-13T18:02:03,024][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.5.0-cisco-ftd-asa-ftd-pipeline]
[2020-02-13T18:02:03,040][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.5.2-cisco-asa-asa-ftd-pipeline]
[2020-02-13T18:02:03,050][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.5.0-elasticsearch-slowlog-pipeline]
[2020-02-13T18:02:03,058][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.5.2-elasticsearch-audit-pipeline-json]
[2020-02-13T18:02:03,109][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.6.0-cisco-ftd-asa-ftd-pipeline]
[2020-02-13T18:02:03,114][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.6.0-elasticsearch-audit-pipeline-json]
[2020-02-13T18:02:03,135][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.5.2-cisco-ftd-asa-ftd-pipeline]
[2020-02-13T18:02:03,148][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.5.2-elasticsearch-slowlog-pipeline]
[2020-02-13T18:02:03,169][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.6.0-elasticsearch-slowlog-pipeline]
[2020-02-13T18:02:03,176][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.5.0-elasticsearch-audit-pipeline-json]
[2020-02-13T18:02:03,956][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [filebeat-7.6.0-cisco-asa-asa-ftd-pipeline]
[2020-02-13T18:02:03,961][WARN ][o.e.x.m.i.i.I.Factory    ] [ela1] failure parsing pipeline config [xpack_monitoring_6]

and these come in blocks, as you can see...

About the link, I double-checked it; it should work: pipeline

Hmmm, interesting, if it is something with the ingest pipelines... Can you download it now? Or tell me where to put it. The file is about 500 KB.

and yesterday:

zcat elk-2020-02-12-1.log.gz | grep 2020-02-12 | grep -c "send mess"
3957
zcat elk-2020-02-12-1.log.gz | grep 2020-02-12 | grep -c "failure parsing pipeline config"
1234
zcat elk-2020-02-12-1.log.gz | grep 2020-02-12 | grep -c "failed to execute on node"
45
zcat elk-2020-02-12-1.log.gz | grep 2020-02-12 | grep -c "failed to execute pipeline"
2908
zcat elk-2020-02-12-1.log.gz | grep 2020-02-12 | grep "send mess" | head
[2020-02-12T20:23:21,159][WARN ][o.e.t.OutboundHandler    ] [ela1] send message failed [channel: Netty4TcpChannel{localAddress=/10.20.132.93:9300, remoteAddress=/10.20.132.94:42874}]
[2020-02-12T20:24:55,403][WARN ][o.e.t.OutboundHandler    ] [ela1] send message failed [channel: Netty4TcpChannel{localAddress=/10.20.132.93:9300, remoteAddress=/10.20.132.94:43194}]
[2020-02-12T20:24:58,189][WARN ][o.e.t.OutboundHandler    ] [ela1] send message failed [channel: Netty4TcpChannel{localAddress=/10.20.132.93:9300, remoteAddress=/10.20.132.94:43190}]
[2020-02-12T20:25:05,159][WARN ][o.e.t.OutboundHandler    ] [ela1] send message failed [channel: Netty4TcpChannel{localAddress=/10.20.132.93:9300, remoteAddress=/10.20.132.94:43192}]

right after the upgrade... of course, it could also be something that started with the Elasticsearch restart.

Thanks @Fangy,

I was able to download the pipeline now. I opened https://github.com/elastic/elasticsearch/issues/52339 so the appropriate team can investigate this as a bug, with some notes on where it is probably coming from.

In the meantime, I would suggest removing the existing pipelines and letting Logstash and Beats recreate them. My best guess is that something went wrong during the upgrade, the pre-7.6 pipelines are now broken, and as a side effect they are revealing the bug in the metrics.

@Fangy, we've committed a fix for this bug here: https://github.com/elastic/elasticsearch/pull/52543

Thank you for reporting it and providing us with the diagnostic info to fix it.
