Logstash and Filebeat on Kubernetes - Invalid Frame Type, received: 69

This issue matches a few others I've seen, but the difference is that I've already made sure my configs do NOT have the problems those threads had:

  1. SSL is turned off explicitly on both Logstash AND filebeat/metricbeat
  2. I am absolutely using the logstash output from filebeat/metricbeat and did not leave it as the default elasticsearch.
  3. There aren't any health checks; the Kubernetes liveness/readiness probes are turned off for now.
  4. I'm running version 7.3.2 of Logstash, Elasticsearch, Filebeat, and Metricbeat

Logstash Config

input {
  beats {
    port => 7998
  }
}
...
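
The rest of the pipeline is elided above (the beats input also has ssl => false set explicitly, per point 1). For reference, the output side is just a stock Elasticsearch output, roughly like this (the hosts value here is a placeholder, not my exact config):

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
  }
}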

Filebeat/Metricbeat config

I would like to note two things:

  1. Both filebeat and metricbeat were briefly working even while these errors were being logged, but it comes and goes and is mostly broken now. The results are the same for both (they break or work at the same time).
  2. I'm using Terraform to template the configs, so they definitely match between filebeat and metricbeat.
filebeat.yml:
---
logging:
  json: true
filebeat.inputs:
- type: docker
  containers.ids:
  - '*'
output.logstash:
  hosts:
    - 'logstash:7998'
  ssl:
    enabled: false
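
metricbeat.yml gets the same templated output block, so the rendered result has the same shape (shown here for completeness, not copied verbatim from the file):

output.logstash:
  hosts:
    - 'logstash:7998'
  ssl:
    enabled: false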

Error

About every 10 seconds Logstash outputs this:

[2019-09-27T00:27:19,807][INFO ][org.logstash.beats.BeatsHandler] [local: 0.0.0.0:7998, remote: undefined] Handling exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 69
[2019-09-27T00:27:19,807][WARN ][io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
io.netty.handler.codec.DecoderException: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 69
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:38) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:353) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) [netty-all-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.30.Final.jar:4.1.30.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 69
        at org.logstash.beats.BeatsParser.decode(BeatsParser.java:92) ~[logstash-input-beats-6.0.0.jar:?]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        ... 8 more

Occasionally I'm seeing Invalid Frame Type, received: 84 as well.

The Filebeat/Metricbeat logs don't show anything besides their normal output.

I'd also like to note that I'm running everything in Kubernetes. I've double-checked the config files: I've exec'd into the containers and made sure the filebeat.yml file matches the config I want.

To be extra clear, no events are getting through at all anymore.

Note that the remote address is undefined. My guess is that the remote end disconnected before Logstash tried to parse the frame, which means it was a very short-lived connection that didn't care about getting a response. It could be a port scanner. It could be a health check for a load balancer (and yes, I know you said you turned health checks off).
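
For what it's worth, anything that speaks plaintext at that port will trigger exactly these errors: 69 and 84 are the ASCII codes for 'E' and 'T', which is what you'd expect if HTTP text (think "GET") were arriving on a port that expects the binary Beats protocol. For example, a probe like this (hypothetical, not something from your chart) pointed at the Beats port would send an HTTP request at each interval and produce that log line, while also failing, since the port doesn't speak HTTP:

readinessProbe:
  httpGet:
    path: /
    port: 7998
  periodSeconds: 10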


Thanks for the help. A quick note: I'm using this Helm chart and deploying with Terraform. Your comment pointed me in the right direction.

I had been temporarily mucking around with metrics and ports, so I reverted those back to the defaults. I was still getting the same logs, but the remote IP was showing up this time:

[2019-09-27T18:04:22,772][INFO ][org.logstash.beats.BeatsHandler] [local: 10.42.4.71:7998, remote: 10.42.5.57:34640] Handling exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 84

On the Filebeat side I noticed that the queue was building up, but once I reverted the metrics changes Filebeat started receiving acks in its logs. The backup was caused by a bug either in the chart or in Terraform: when I changed the ports, the Services didn't update properly and Kubernetes decided Logstash wasn't reachable even though it was running. I'm not sure what was sending data to Logstash at that point; a readiness check could have been it, but the ports didn't line up.
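
As an aside, the part that has to line up is just the Service port versus the port the Beats point at. Simplified, it looks roughly like this (the shape of what the chart renders, not the exact manifest):

apiVersion: v1
kind: Service
metadata:
  name: logstash
spec:
  selector:
    app: logstash
  ports:
    - name: beats
      port: 7998        # what 'logstash:7998' in the beat configs connects to
      targetPort: 7998  # the beats input port inside the Logstash pod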

The Filebeat logs started showing acks:

{"level":"info","timestamp":"2019-09-27T18:07:04.674Z","logger":"monitoring","caller":"log/log.go:145","message":"Non-zero metrics in the last 30s","monitoring":{"metrics":{"beat":{"cpu":{"system":{"ticks":89810,"time":{"ms":38}},"total":{"ticks":280480,"time":{"ms":102},"value":280480},"user":{"ticks":190670,"time":{"ms":64}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":13},"info":{"ephemeral_id":"0e7b0582-1055-4663-bc9d-801b0a00112f","uptime":{"ms":64680021}},"memstats":{"gc_next":9842912,"memory_alloc":7029720,"memory_total":68295099320},"runtime":{"goroutines":46}},"filebeat":{"events":{"active":-3,"added":55,"done":58},"harvester":{"open_files":5,"running":5}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":58,"batches":17,"total":58},"read":{"bytes":102},"write":{"bytes":18518}},"pipeline":{"clients":1,"events":{"active":0,"published":55,"total":55},"queue":{"acked":58}}},"registrar":{"states":{"current":27,"update":58},"writes":{"success":17,"total":17}},"system":{"load":{"1":0.05,"15":0.33,"5":0.21,"norm":{"1":0.0063,"15":0.0413,"5":0.0263}}}}}}

At this point I checked the remote IPs and confirmed they are coming from Metricbeat; the Filebeat IPs aren't in there.

It also turns out that at the same time I made port changes, I also made pipeline changes, which is why logs stopped making it to Elasticsearch. Fixing the pipeline resolved that issue.

I then realized that I had turned on the Logstash module in Metricbeat and had it pointed to the Beats port in Logstash:

# metricbeat.yaml
---
- module: logstash
  enabled: true
  metricsets:
    - node
    - node_stats
  period: 10s
  hosts: ["${logstash}"]

Turning this off fixed the log noise, which makes sense: the logstash module polls Logstash over HTTP, and plaintext HTTP is exactly what the Beats input had been rejecting as invalid frames.
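
If I did want Logstash metrics from Metricbeat, the right fix would be to point the module at Logstash's HTTP monitoring API (port 9600 by default) rather than the Beats input port, something along these lines, assuming the default http.port:

- module: logstash
  enabled: true
  metricsets:
    - node
    - node_stats
  period: 10s
  hosts: ["logstash:9600"]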

Thanks for the help and pointing me in the right direction.
