Slow Cluster in Elastic Cloud since updating to 7.12

Hi guys,
Since we updated our cluster to 7.12, we have noticed that it is very slow, even though we didn't increase the data volume or the number of shards.

In my logs in Elastic Cloud one kind of message keeps repeating:

[instance-0000000024] sending transport message [MessageSerializer{Response{11514}{false}{false}{false}{class org.elasticsearch.action.get.GetResponse}}] of size [763] on [Netty4TcpChannel{localAddress=/X:19190, remoteAddress=/Y:36188, profile=default}] took [5106ms] which is above the warn threshold of [5000ms]

I replaced the IP addresses with X and Y.
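
To see how often this happens and by how much, the warning can be pulled apart with a regular expression. This is only a rough sketch based on the single line above: the field names (`node`, `size`, `took_ms`, `threshold_ms`, `response_class`) are labels I made up, and the pattern assumes every warning has exactly this layout.

```python
import re

# Pattern for the slow-send warning quoted above; group names are our own labels.
SLOW_SEND = re.compile(
    r"^\[(?P<node>[^\]]+)\] sending transport message \[(?P<message>.*)\] "
    r"of size \[(?P<size>\d+)\] on \[(?P<channel>.*)\] "
    r"took \[(?P<took_ms>\d+)ms\] which is above the warn threshold of "
    r"\[(?P<threshold_ms>\d+)ms\]$"
)


def parse_slow_send(line):
    """Return a dict of fields from one warning line, or None if it does not match."""
    match = SLOW_SEND.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("size", "took_ms", "threshold_ms"):
        fields[key] = int(fields[key])
    # The response type is the class name inside the MessageSerializer description.
    cls = re.search(r"\{class ([\w.$]+)\}", fields["message"])
    fields["response_class"] = cls.group(1) if cls else None
    return fields


line = (
    "[instance-0000000024] sending transport message "
    "[MessageSerializer{Response{11514}{false}{false}{false}"
    "{class org.elasticsearch.action.get.GetResponse}}] of size [763] on "
    "[Netty4TcpChannel{localAddress=/X:19190, remoteAddress=/Y:36188, profile=default}] "
    "took [5106ms] which is above the warn threshold of [5000ms]"
)
info = parse_slow_send(line)
# Node, response class, payload size in bytes, and milliseconds over the threshold.
print(info["node"], info["response_class"], info["size"],
      info["took_ms"] - info["threshold_ms"])
# prints: instance-0000000024 org.elasticsearch.action.get.GetResponse 763 106
```

For the line above, that is a 763-byte GetResponse that took 106 ms longer than the 5000 ms warn threshold to send.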

Thank you for any help!

The output from the nodes hot threads API:

::: {tiebreaker-0000000035}{Su8sih-mSkeLE26JRAPErw}{tSY_Ba59SiiWVXDS0E47ig}{172.26.171.180}{172.26.171.180:19946}{mv}{logical_availability_zone=tiebreaker, server_name=tiebreaker-0000000035.433e4a7f73fd4ed0a7436f398beaf0bd, availability_zone=sa-east-1c, xpack.installed=true, instance_configuration=aws.master.r4, transform.node=false, region=sa-east-1}
   Hot threads at 2021-03-30T23:22:13.848Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

::: {instance-0000000027}{8h6AA-nlT_6xhv4HQNFfwQ}{CywCOCuKSamlgm3e7TZxeg}{172.26.178.137}{172.26.178.137:19088}{ir}{logical_availability_zone=zone-1, server_name=instance-0000000027.433e4a7f73fd4ed0a7436f398beaf0bd, availability_zone=sa-east-1c, xpack.installed=true, instance_configuration=aws.coordinating.m5, transform.node=false, region=sa-east-1}
   Hot threads at 2021-03-30T23:22:13.845Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
   34.0% (170ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000027][transport_worker][T#2]'
     2/10 snapshots sharing following 67 elements
       app//org.elasticsearch.common.document.DocumentField$$Lambda$5705/0x0000000801b27720.read(Unknown Source)
       app//org.elasticsearch.common.io.stream.StreamInput.readCollection(StreamInput.java:1238)
       app//org.elasticsearch.common.io.stream.StreamInput.readList(StreamInput.java:1188)
       app//org.elasticsearch.common.document.DocumentField.<init>(DocumentField.java:42)
       app//org.elasticsearch.search.SearchHit$$Lambda$5704/0x0000000801b27500.read(Unknown Source)
       app//org.elasticsearch.common.io.stream.StreamInput.readMap(StreamInput.java:640)
       app//org.elasticsearch.common.io.stream.StreamInput.readMap(StreamInput.java:624)
       app//org.elasticsearch.search.SearchHit.<init>(SearchHit.java:153)
       app//org.elasticsearch.search.SearchHits.<init>(SearchHits.java:84)
       app//org.elasticsearch.search.internal.InternalSearchResponse.<init>(InternalSearchResponse.java:42)
       app//org.elasticsearch.action.search.SearchResponse.<init>(SearchResponse.java:72)
       org.elasticsearch.xpack.core.search.action.AsyncSearchResponse$$Lambda$5671/0x0000000801b0e080.read(Unknown Source)
       app//org.elasticsearch.common.io.stream.StreamInput.readOptionalWriteable(StreamInput.java:1021)
       org.elasticsearch.xpack.core.search.action.AsyncSearchResponse.<init>(AsyncSearchResponse.java:82)
       org.elasticsearch.xpack.search.TransportGetAsyncStatusAction$$Lambda$4789/0x00000008018a88e8.read(Unknown Source)
       org.elasticsearch.xpack.core.async.AsyncTaskIndexService.decodeResponse(AsyncTaskIndexService.java:499)
       org.elasticsearch.xpack.core.async.AsyncTaskIndexService.lambda$getStatusResponseFromIndex$8(AsyncTaskIndexService.java:415)
       org.elasticsearch.xpack.core.async.AsyncTaskIndexService$$Lambda$5661/0x0000000801b0caa0.accept(Unknown Source)
       app//org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:117)
       app//org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:32)
       app//org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:83)
       app//org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:77)
       app//org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:32)
       org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$applyInternal$2(SecurityActionFilter.java:165)
       org.elasticsearch.xpack.security.action.filter.SecurityActionFilter$$Lambda$5538/0x0000000801ae2e78.accept(Unknown Source)
       app//org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:167)
       app//org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.handleResponse(TransportSingleShardAction.java:240)
       app//org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.handleResponse(TransportSingleShardAction.java:231)
       app//org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1280)
       app//org.elasticsearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:291)
       app//org.elasticsearch.transport.InboundHandler.handleResponse(InboundHandler.java:275)
       app//org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:128)
       app//org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:84)
       app//org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:690)
       org.elasticsearch.transport.netty4.Netty4MessageChannelHandler$$Lambda$5112/0x00000008019ba0c8.accept(Unknown Source)
       app//org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:131)
       app//org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:106)
       app//org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:71)
       org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:63)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
       io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
       io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518)
       io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)
       io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)
       io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
       io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
       io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
       io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
       io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
       io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
       io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
       io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
       io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
       io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
       io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       java.base@15.0.1/java.lang.Thread.run(Thread.java:832)
     2/10 snapshots sharing following 57 elements
       java.base@15.0.1/java.lang.StringCoding.encode8859_1(StringCoding.java:612)
       java.base@15.0.1/java.lang.StringCoding.encode8859_1(StringCoding.java:607)
       java.base@15.0.1/java.lang.StringCoding.encode(StringCoding.java:452)
       java.base@15.0.1/java.lang.String.getBytes(String.java:983)
       java.base@15.0.1/java.util.Base64$Decoder.decode(Base64.java:590)

The remaining output:

           org.elasticsearch.xpack.core.async.AsyncTaskIndexService.decodeResponse(AsyncTaskIndexService.java:496)
           org.elasticsearch.xpack.core.async.AsyncTaskIndexService.lambda$getStatusResponseFromIndex$8(AsyncTaskIndexService.java:415)
           org.elasticsearch.xpack.core.async.AsyncTaskIndexService$$Lambda$5661/0x0000000801b0caa0.accept(Unknown Source)
           app//org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:117)
           app//org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:32)
           app//org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:83)
           app//org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:77)
           app//org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:32)
           org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$applyInternal$2(SecurityActionFilter.java:165)
           org.elasticsearch.xpack.security.action.filter.SecurityActionFilter$$Lambda$5538/0x0000000801ae2e78.accept(Unknown Source)
           app//org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:167)
           app//org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.handleResponse(TransportSingleShardAction.java:240)
           app//org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.handleResponse(TransportSingleShardAction.java:231)
           app//org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1280)
           app//org.elasticsearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:291)
           app//org.elasticsearch.transport.InboundHandler.handleResponse(InboundHandler.java:275)
           app//org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:128)
           app//org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:84)
           app//org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:690)
           org.elasticsearch.transport.netty4.Netty4MessageChannelHandler$$Lambda$5112/0x00000008019ba0c8.accept(Unknown Source)
           app//org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:131)
           app//org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:106)
           app//org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:71)
           org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:63)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
           io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
           io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
           io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
           io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518)
           io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)
           io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)
           io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
           io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
           io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
           io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
           io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
           io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
           io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
           io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
           io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
           io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
           io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
           io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
           io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
           java.base@15.0.1/java.lang.Thread.run(Thread.java:832)
         4/10 snapshots sharing following 50 elements
           org.elasticsearch.xpack.core.async.AsyncTaskIndexService$$Lambda$5661/0x0000000801b0caa0.accept(Unknown Source)
           app//org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:117)
           app//org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:32)
           app//org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:83)
           app//org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:77)
           app//org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:32)
           org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$applyInternal$2(SecurityActionFilter.java:165)
           org.elasticsearch.xpack.security.action.filter.SecurityActionFilter$$Lambda$5538/0x0000000801ae2e78.accept(Unknown Source)
           app//org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:167)
           app//org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.handleResponse(TransportSingleShardAction.java:240)
           app//org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.handleResponse(TransportSingleShardAction.java:231)
           app//org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1280)
           app//org.elasticsearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:291)
           app//org.elasticsearch.transport.InboundHandler.handleResponse(InboundHandler.java:275)
           app//org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:128)
           app//org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:84)
           app//org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:690)
           org.elasticsearch.transport.netty4.Netty4MessageChannelHandler$$Lambda$5112/0x00000008019ba0c8.accept(Unknown Source)
           app//org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:131)
           app//org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:106)
           app//org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:71)
           org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:63)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
           io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
           io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
           io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
           io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518)
           io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)
           io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)
           io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
           io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
           io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
           io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
           io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
           io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
           io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
           io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
           io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
           io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
           io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
           io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
           io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
           io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
           java.base@15.0.1/java.lang.Thread.run(Thread.java:832)
         2/10 snapshots sharing following 20 elements
           io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)
           io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)
           io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)

Last part:

               io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
               io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
               io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
               io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
               io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
               io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
               io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
               io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
               io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
               io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
               io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
               io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
               java.base@15.0.1/java.lang.Thread.run(Thread.java:832)
     29.9% (149.6ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000027][transport_worker][T#1]'
             8/10 snapshots sharing following 24 elements
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
               io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
               io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518)
               io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)
               io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)
               io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
               io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
               io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
               io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
               io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
               io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
               io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
               io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
               io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
               io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
               io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
               io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
               io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
               java.base@15.0.1/java.lang.Thread.run(Thread.java:832)
             2/10 snapshots sharing following 25 elements
               java.base@15.0.1/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506)
               java.base@15.0.1/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482)
               java.base@15.0.1/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:637)
               io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:282)
               io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1372)
               io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)
               io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)
               io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
               io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
               io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
               io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
               io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
               io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
               io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
               io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
               io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
               io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
               io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
               io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
               io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
               io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
               java.base@15.0.1/java.lang.Thread.run(Thread.java:832)

        ::: {instance-0000000024}{ctdyx5xlTWOzgOmaq8qjsQ}{93oKMN2tS2ytEHwSMNcYHQ}{172.26.54.191}{172.26.54.191:19190}{hmrst}{logical_availability_zone=zone-0, server_name=instance-0000000024.433e4a7f73fd4ed0a7436f398beaf0bd, availability_zone=sa-east-1a, xpack.installed=true, instance_configuration=aws.data.highio.i3, transform.node=true, region=sa-east-1}
           Hot threads at 2021-03-30T23:22:13.847Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

        ::: {instance-0000000025}{aVnmVgQOS2GEO_Sb0dUNEA}{ZLIALxcCSPybk9pO_a3xYg}{172.26.70.155}{172.26.70.155:19553}{hmrst}{logical_availability_zone=zone-1, server_name=instance-0000000025.433e4a7f73fd4ed0a7436f398beaf0bd, availability_zone=sa-east-1b, xpack.installed=true, instance_configuration=aws.data.highio.i3, transform.node=true, region=sa-east-1}
           Hot threads at 2021-03-30T23:22:13.875Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
           
            4.4% (22ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000025][transport_worker][T#1]'
             2/10 snapshots sharing following 6 elements
               io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
               io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
               io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
               io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
               io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
               java.base@15.0.1/java.lang.Thread.run(Thread.java:832)
             4/10 snapshots sharing following 9 elements
               java.base@15.0.1/sun.nio.ch.EPoll.wait(Native Method)
               java.base@15.0.1/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
               java.base@15.0.1/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:129)
               java.base@15.0.1/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:146)
               io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:803)
               io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
               io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
               io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
               java.base@15.0.1/java.lang.Thread.run(Thread.java:832)
           
            1.1% (5.5ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000025][transport_worker][T#2]'
             unique snapshot
               java.base@15.0.1/sun.nio.ch.SocketDispatcher.read0(Native Method)
               java.base@15.0.1/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
               java.base@15.0.1/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276)
               java.base@15.0.1/sun.nio.ch.IOUtil.read(IOUtil.java:233)
               java.base@15.0.1/sun.nio.ch.IOUtil.read(IOUtil.java:223)
               java.base@15.0.1/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:389)
               org.elasticsearch.transport.CopyBytesSocketChannel.readFromSocketChannel(CopyBytesSocketChannel.java:130)
               org.elasticsearch.transport.CopyBytesSocketChannel.doReadBytes(CopyBytesSocketChannel.java:115)
               io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
               io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
               io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615)
               io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578)
               io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
               io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
               io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
               java.base@15.0.1/java.lang.Thread.run(Thread.java:832)
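
Since the full output is long, here is a small sketch that reduces hot threads text to the busiest threads per node. It assumes node sections start with `::: {node-name}` and thread lines look like `34.0% (170ms out of 500ms) cpu usage by thread '...'`, as in the output above. The function name `busiest_threads` and the shortened `sample` (node attributes cut down, stack frames dropped) are my own.

```python
import re

# A node section starts with "::: {node-name}{...}".
NODE_HEADER = re.compile(r"^\s*::: \{(?P<node>[^}]+)\}")
# A thread summary line: CPU percentage, timing detail, then the quoted thread name.
THREAD_LINE = re.compile(
    r"^\s*(?P<pct>\d+(?:\.\d+)?)% \(.*?\) cpu usage by thread '(?P<thread>[^']+)'"
)


def busiest_threads(text):
    """Map each node name to a list of (cpu_percent, thread_name), busiest first."""
    per_node = {}
    current = None
    for line in text.splitlines():
        header = NODE_HEADER.match(line)
        if header:
            current = header.group("node")
            per_node[current] = []
            continue
        thread = THREAD_LINE.match(line)
        if thread and current is not None:
            per_node[current].append(
                (float(thread.group("pct")), thread.group("thread"))
            )
    for threads in per_node.values():
        threads.sort(reverse=True)
    return per_node


# Shortened excerpt of the output above (attributes trimmed, stack frames omitted).
sample = """\
::: {tiebreaker-0000000035}{Su8sih-mSkeLE26JRAPErw}{mv}
   Hot threads at 2021-03-30T23:22:13.848Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

::: {instance-0000000027}{8h6AA-nlT_6xhv4HQNFfwQ}{ir}
   Hot threads at 2021-03-30T23:22:13.845Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   29.9% (149.6ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000027][transport_worker][T#1]'
   34.0% (170ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000027][transport_worker][T#2]'
"""

for node, threads in busiest_threads(sample).items():
    print(node, threads if threads else "no hot threads")
```

On the full output this gives instance-0000000027 (the coordinating node) with its two transport_worker threads at 34.0% and 29.9%, instance-0000000025 at 4.4% and 1.1%, and nothing for the tiebreaker or instance-0000000024.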

We aren't all guys :slight_smile:

What is the output from the _cluster/stats?pretty&human API?

You are right, I should have said people :smiley:

The output is:

{
  "status": "green",
  "cluster_name": "433e4a7f73fd4ed0a7436f398beaf0bd",
  "timestamp": 1617155285627,
  "_nodes": {
    "successful": 4,
    "failed": 0,
    "total": 4
  },
  "cluster_uuid": "g6PzQ9BdTW2nYnmiXpujmQ",
  "indices": {
    "count": 206,
    "completion": {
      "size_in_bytes": 0
    },
    "fielddata": {
      "evictions": 0,
      "memory_size_in_bytes": 3508940
    },
    "versions": [
      {
        "index_count": 13,
        "version": "7.2.0",
        "primary_shard_count": 13,
        "total_primary_bytes": 3210802845
      },
      {
        "index_count": 8,
        "version": "7.4.2",
        "primary_shard_count": 8,
        "total_primary_bytes": 14891106
      },
      {
        "index_count": 12,
        "version": "7.6.0",
        "primary_shard_count": 12,
        "total_primary_bytes": 18425777773
      },
      {
        "index_count": 7,
        "version": "7.8.0",
        "primary_shard_count": 7,
        "total_primary_bytes": 941664577
      },
      {
        "index_count": 5,
        "version": "7.8.1",
        "primary_shard_count": 5,
        "total_primary_bytes": 19080175
      },
      {
        "index_count": 10,
        "version": "7.9.0",
        "primary_shard_count": 10,
        "total_primary_bytes": 329827208
      },
      {
        "index_count": 5,
        "version": "7.9.1",
        "primary_shard_count": 5,
        "total_primary_bytes": 85317646
      },
      {
        "index_count": 5,
        "version": "7.9.2",
        "primary_shard_count": 5,
        "total_primary_bytes": 38214660
      },
      {
        "index_count": 13,
        "version": "7.9.3",
        "primary_shard_count": 13,
        "total_primary_bytes": 62468563
      },
      {
        "index_count": 13,
        "version": "7.10.0",
        "primary_shard_count": 13,
        "total_primary_bytes": 26754755
      },
      {
        "index_count": 41,
        "version": "7.10.1",
        "primary_shard_count": 41,
        "total_primary_bytes": 1240198386
      },
      {
        "index_count": 51,
        "version": "7.10.2",
        "primary_shard_count": 51,
        "total_primary_bytes": 3407187405
      },
      {
        "index_count": 23,
        "version": "7.12.0",
        "primary_shard_count": 23,
        "total_primary_bytes": 199822226
      }
    ],
    "docs": {
      "count": 153129977,
      "deleted": 297575
    },
    "segments": {
      "count": 1826,
      "max_unsafe_auto_id_timestamp": 1617136808745,
      "term_vectors_memory_in_bytes": 0,
      "version_map_memory_in_bytes": 267491,
      "norms_memory_in_bytes": 2928640,
      "stored_fields_memory_in_bytes": 1567920,
      "file_sizes": {},
      "doc_values_memory_in_bytes": 2694054,
      "fixed_bit_set_memory_in_bytes": 701920,
      "points_memory_in_bytes": 0,
      "terms_memory_in_bytes": 22423024,
      "memory_in_bytes": 29613638,
      "index_writer_memory_in_bytes": 14741808
    },
    "shards": {
      "replication": 1,
      "total": 412,
      "primaries": 206,
      "index": {
        "replication": {
          "max": 1,
          "avg": 1,
          "min": 1
        },
        "primaries": {
          "max": 1,
          "avg": 1,
          "min": 1
        },
        "shards": {
          "max": 2,
          "avg": 2,
          "min": 2
        }
      }
    },
    "analysis": {
      "built_in_char_filters": [],
      "analyzer_types": [],
      "built_in_analyzers": [],
      "built_in_filters": [],
      "filter_types": [],
      "char_filter_types": [],
      "built_in_tokenizers": [],
      "tokenizer_types": []
    },
    "query_cache": {
      "miss_count": 54793,
      "total_count": 55803,
      "evictions": 0,
      "memory_size_in_bytes": 574389,
      "hit_count": 1010,
      "cache_size": 249,
      "cache_count": 249
    },
    "store": {
      "size_in_bytes": 56053884940,
      "reserved_in_bytes": 0
    },
    "mappings": {
      "field_types": [
        {
          "count": 1,
          "index_count": 1,
          "name": "binary"
        },
        {
          "count": 469,
          "index_count": 65,
          "name": "boolean"
        },
        {
          "count": 7,
          "index_count": 7,
          "name": "byte"
        },
        {
          "count": 3,
          "index_count": 1,
          "name": "constant_keyword"
        },
        {
          "count": 800,
          "index_count": 155,
          "name": "date"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "date_nanos"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "date_range"
        },
        {
          "count": 20,
          "index_count": 5,
          "name": "double"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "double_range"
        },
        {
          "count": 257,
          "index_count": 60,
          "name": "float"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "float_range"
        },
        {
          "count": 71,
          "index_count": 11,
          "name": "geo_point"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "geo_shape"
        },
        {
          "count": 10,
          "index_count": 4,
          "name": "half_float"
        },
        {
          "count": 6,
          "index_count": 6,
          "name": "histogram"
        },
        {
          "count": 20,
          "index_count": 12,
          "name": "integer"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "integer_range"
        },
        {
          "count": 133,
          "index_count": 13,
          "name": "ip"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "ip_range"
        },
        {
          "count": 10566,
          "index_count": 155,
          "name": "keyword"
        },
        {
          "count": 1791,
          "index_count": 115,
          "name": "long"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "long_range"
        },
        {
          "count": 71,
          "index_count": 43,
          "name": "nested"
        },
        {
          "count": 6022,
          "index_count": 111,
          "name": "object"
        },
        {
          "count": 42,
          "index_count": 6,
          "name": "scaled_float"
        },
        {
          "count": 1,
          "index_count": 1,
          "name": "shape"
        },
        {
          "count": 2,
          "index_count": 2,
          "name": "short"
        },
        {
          "count": 4143,
          "index_count": 145,
          "name": "text"
        }
      ]
    }

The rest of the output:

   },
      "nodes": {
        "count": {
          "data_warm": 0,
          "data_hot": 2,
          "coordinating_only": 0,
          "data": 0,
          "data_frozen": 0,
          "transform": 2,
          "ingest": 1,
          "master": 3,
          "voting_only": 1,
          "ml": 0,
          "remote_cluster_client": 3,
          "total": 4,
          "data_cold": 0,
          "data_content": 2
        },
        "fs": {
          "free_in_bytes": 514992394240,
          "total_in_bytes": 571230650368,
          "available_in_bytes": 514992394240
        },
        "versions": [
          "7.12.0"
        ],
        "process": {
          "open_file_descriptors": {
            "max": 1870,
            "avg": 1296,
            "min": 691
          },
          "cpu": {
            "percent": 14
          }
        },
        "network_types": {
          "transport_types": {
            "security4": 4
          },
          "http_types": {
            "security4": 4
          }
        },
        "ingest": {
          "number_of_pipelines": 26,
          "processor_stats": {
            "rename": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "grok": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "convert": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "set": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "script": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "enrich": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "conditional": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "pipeline": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "remove": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "gsub": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "user_agent": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "geoip": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "date": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            },
            "csv": {
              "count": 0,
              "failed": 0,
              "current": 0,
              "time_in_millis": 0
            }
          }
        },
        "packaging_types": [
          {
            "count": 4,
            "flavor": "default",
            "type": "docker"
          }
        ],
        "discovery_types": {
          "zen": 4
        },
        "jvm": {
          "mem": {
            "heap_used_in_bytes": 7788549736,
            "heap_max_in_bytes": 11383341056
          },
          "threads": 337,
          "max_uptime_in_millis": 19326165,
          "versions": [
            {
              "vm_name": "OpenJDK 64-Bit Server VM",
              "count": 4,
              "vm_version": "15.0.1+9",
              "using_bundled_jdk": true,
              "bundled_jdk": true,
              "version": "15.0.1",
              "vm_vendor": "AdoptOpenJDK"
            }
          ]
        },
        "plugins": [
          {
            "has_native_controller": false,
            "description": "The S3 repository plugin adds S3 repositories",
            "java_version": "1.8",
            "licensed": false,
            "classname": "org.elasticsearch.repositories.s3.S3RepositoryPlugin",
            "version": "7.12.0",
            "elasticsearch_version": "7.12.0",
            "extended_plugins": [],
            "type": "isolated",
            "name": "repository-s3"
          },
          {
            "has_native_controller": false,
            "description": "A bootstrap plugin that adds support for interfacing with filesystem that enforce user quotas.",
            "java_opts": "-Djava.nio.file.spi.DefaultFileSystemProvider=org.elasticsearch.fs.quotaaware.QuotaAwareFileSystemProvider",
            "java_version": "1.8",
            "licensed": true,
            "classname": "",
            "version": "7.12.0",
            "elasticsearch_version": "7.12.0",
            "extended_plugins": [],
            "type": "bootstrap",
            "name": "quota-aware-fs"
          }
        ],
        "os": {
          "pretty_names": [
            {
              "count": 4,
              "pretty_name": "CentOS Linux 8"
            }
          ],
          "mem": {
            "free_in_bytes": 2756349952,
            "free_percent": 12,
            "used_in_bytes": 19792228352,
            "total_in_bytes": 22548578304,
            "used_percent": 88
          },
          "allocated_processors": 8,
          "architectures": [
            {
              "count": 4,
              "arch": "amd64"
            }
          ],
          "available_processors": 8,
          "names": [
            {
              "count": 4,
              "name": "Linux"
            }
          ]
        }
      }
    }
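
As a quick sanity check on the numbers posted above, a few ratios can be derived from the stats. This is only a sketch of the arithmetic: the values are copied from the output in this thread, and the helper name `percent` is made up for illustration.

```python
def percent(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Values copied from the _cluster/stats output posted in this thread.
heap_used_in_bytes = 7788549736
heap_max_in_bytes = 11383341056
query_cache_hit_count = 1010
query_cache_total_count = 55803

heap_used_percent = percent(heap_used_in_bytes, heap_max_in_bytes)
query_cache_hit_percent = percent(query_cache_hit_count, query_cache_total_count)

print("JVM heap used (cluster-wide):", heap_used_percent, "%")
print("Query cache hit ratio:", query_cache_hit_percent, "%")
```

These are cluster-wide totals, so they can hide a single node that is under pressure.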

Thanks, nothing jumps out at me immediately. Is there anything in Monitoring that might correlate with these log messages, any spikes etc?

Yes, we saw a big spike in every metric in the performance section. We made two changes in the past few days, both on the same day, last Friday:
1 - we split the single Logstash instance that sent data from MySQL and MongoDB into two Logstash instances, one for each database.
2 - we updated to 7.12.

Every metric, like GC, CPU and disk use?

@warkolm I just took this screenshot:

The log is full of the same message, even when the load on the cluster is very low (it's 7 AM, basically nobody is using our cluster, and the number of indexed documents is also very low).

About every second:

[instance-0000000024] sending transport message [MessageSerializer{Response{30185}{false}{false}{false}{class org.elasticsearch.action.get.GetResponse}}] of size [647] on [Netty4TcpChannel{localAddress=/172.17.0.10:19190, remoteAddress=/172.26.46.202:47342, profile=default}] took [6004ms] which is above the warn threshold of [5000ms]

And sometimes this also appears for all our instances:

[instance-0000000024] GC did bring memory usage down, before [4091706208], after [2293931872], allocations [45], duration [145]
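
To see how often the transport warning fires and which response type is affected, the lines can be parsed with a small script. This is a minimal sketch: the regular expression is written only against the message format quoted in this thread (other versions may word it differently), and the names `SLOW_SEND` and `parse_slow_send` are hypothetical.

```python
import re

# Regex for the slow-send warning exactly as quoted in this thread.
# Assumption: other Elasticsearch versions may word the message differently.
SLOW_SEND = re.compile(
    r"\[(?P<node>[^\]]+)\] sending transport message \[.*class (?P<cls>[\w.$]+)\}\}\] "
    r"of size \[(?P<size>\d+)\] on \[.*\] "
    r"took \[(?P<took>\d+)ms\] which is above the warn threshold of \[(?P<warn>\d+)ms\]"
)


def parse_slow_send(line):
    """Return the interesting fields of one warning line, or None if it does not match."""
    m = SLOW_SEND.search(line)
    if m is None:
        return None
    return {
        "node": m.group("node"),
        "response_class": m.group("cls"),
        "size_bytes": int(m.group("size")),
        "took_ms": int(m.group("took")),
        "threshold_ms": int(m.group("warn")),
    }


# The warning quoted above, split across string literals for readability.
SAMPLE = (
    "[instance-0000000024] sending transport message "
    "[MessageSerializer{Response{30185}{false}{false}{false}"
    "{class org.elasticsearch.action.get.GetResponse}}] of size [647] on "
    "[Netty4TcpChannel{localAddress=/172.17.0.10:19190, "
    "remoteAddress=/172.26.46.202:47342, profile=default}] "
    "took [6004ms] which is above the warn threshold of [5000ms]"
)

parsed = parse_slow_send(SAMPLE)
print(parsed["response_class"], "exceeded the threshold by",
      parsed["took_ms"] - parsed["threshold_ms"], "ms")
```

Running this over a whole log file and grouping by `response_class` would show whether the slow sends are limited to `GetResponse` messages.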

What platform are you using (AWS/GCP/...) and in what region is this cluster?

Hi @DavidTurner we are using Elastic Cloud in AWS. We are in Sao Paulo (sa-east-1).

Thanks. Your cluster is out of CPU credits which normally means it's too small for the workload. Most of the load seems to be due to hitting the _async_search/status endpoint very hard, multiple times per second.

@DavidTurner but yesterday we doubled the size of the cluster, so why didn't we get enough new CPU credits?
When will we be able to recover these CPU credits?
The cluster was working flawlessly.

On closer inspection it seems the CPU credits aren't even being reported. I suggest opening a support case to investigate why that might be.

The credits are replenished over time, but if the cluster is constantly overloaded then they will be consumed immediately.

@DavidTurner is there anything that could speed up the response from support?

We have seen that the credits are indeed not replenishing. I opened the ticket 3 hours ago.
We depend on this service, and in about 1 hour we will have a big load on the cluster.

Thank you

The time it takes for the proper support team to respond is determined by the severity level of the case (and your subscription level ofc).

I don't have any better ideas than to suggest working out what is generating all the load on the cluster and getting it to stop. If you can't pin it down you could at least change the password so that the requests will be rejected.

I have the basic subscription and opened the ticket about 10 hours ago. I don't have a clue when we will receive any help.
I can't even upgrade the severity level (is there some way?). I'm really sad about this situation.

But thanks anyway David

Greetings,

We are getting a similar issue with a cluster upgraded to 7.12. This one is on premises. I do not want to hijack your thread, so the highlights are:

  • 3-node cluster
  • 2 of the nodes are constantly pushing 200-300 Mbps to the other node. (This is new and way higher than usual, which is around 20 Mbps.)
  • The receiving node hosts Kibana, which might be related. It is not the elected master.
  • The problematic node is constantly hitting 100% CPU.
  • CPU consumption on the other 2 nodes is fine (on trend), based on last month.

The behavior started with 7.12; we were on 7.10 last week.

I could not figure out what is pushing so much data to the Kibana node. This cluster is doing log indexing; it is not frequently used for search operations.

Looking at active tasks, I can see 400+ instances of "cluster:monitor/async_search/status".
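
A count like that can be produced by grouping the response of the task management API (`GET _tasks`) by action. The following is a minimal sketch, assuming the JSON response has already been loaded into a dict. The function name `count_tasks_by_action` is hypothetical, and the node and task IDs in the sample are made up for illustration, not taken from either cluster in this thread.

```python
from collections import Counter


def count_tasks_by_action(tasks_response):
    """Count running tasks per action in a `GET _tasks` style response."""
    counts = Counter()
    for node in tasks_response.get("nodes", {}).values():
        for task in node.get("tasks", {}).values():
            counts[task.get("action", "unknown")] += 1
    return counts


# Hypothetical, heavily trimmed response; real responses carry more fields.
sample_response = {
    "nodes": {
        "node-a": {
            "name": "example-node",
            "tasks": {
                "node-a:1": {"action": "cluster:monitor/async_search/status"},
                "node-a:2": {"action": "cluster:monitor/async_search/status"},
                "node-a:3": {"action": "indices:data/write/bulk"},
            },
        }
    }
}

counts = count_tasks_by_action(sample_response)
# Print the busiest actions first.
for action, n in counts.most_common():
    print(n, action)
```

Sorting by count makes a flood of one action, such as the async search status checks, stand out immediately.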

If you feel the issue is similar I can provide more information.

Thanks!
