Hello,
I have 10 Kubernetes clusters that forward their logs to a Logstash VM (k8s fluentd ---> Logstash port 7000).
Logstash reaches a point where logs are dropped and the source pods retry to get their logs through (the errors I see in this case are listed below).
I am looking for recommendations to optimize the Logstash tcp input.
Errors on Logstash:
[ERROR][logstash.inputs.tcp ] xxxxxxxxxxxxxxx/x.x.x.x:16591: closing due:
java.net.SocketException: Connection reset
at sun.nio.ch.SocketChannelImpl.throwConnectionReset(SocketChannelImpl.java:394) ~[?:?]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:426) ~[?:?]
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253) ~[netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132) ~[netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350) ~[netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151) [netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) [netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) [netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) [netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-all-4.1.65.Final.jar:4.1.65.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.65.Final.jar:4.1.65.Final]
at java.lang.Thread.run(Thread.java:833) [?:?]
Current input snippet:
input {
  tcp {
    codec => fluent
    port => 7000
    tcp_keep_alive => true
  }
}
Changes I tried without success:
- Optimized sysctl on the Logstash VM
vm.max_map_count=262144
fs.file-max=65535
net.core.netdev_max_backlog=250000
net.core.netdev_budget=600
net.ipv4.tcp_mem=16777216 16777216 16777216
- Increased ring buffers
ethtool -g ens192
Ring parameters for ens192:
Pre-set maximums:
RX: 4096
RX Mini: 2048
RX Jumbo: 4096
TX: 4096
Current hardware settings:
RX: 4096
RX Mini: 2048
RX Jumbo: 4096
TX: 4096
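For reference, this is roughly how the sysctl values listed above could be made persistent across reboots. It is only a sketch: the drop-in file name `99-logstash.conf` is a hypothetical choice, and the values are simply the ones from the list above, not recommendations.

```conf
# /etc/sysctl.d/99-logstash.conf  (hypothetical file name, any *.conf name works)
# Values copied unchanged from the list above; they are what was tried, not tuned advice.
vm.max_map_count = 262144
fs.file-max = 65535
net.core.netdev_max_backlog = 250000
net.core.netdev_budget = 600
net.ipv4.tcp_mem = 16777216 16777216 16777216
```

After writing the file, `sysctl --system` reloads all drop-in files, and `sysctl net.core.netdev_max_backlog` prints the live value so it can be checked against the file.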