Hi
I've run into a problem with the transport_worker threads.
In my workflow, an ingest node (36 GB of memory) receives data from the source, and this data then has to be forwarded to the data nodes (16 GB each).
Is there a way to limit the amount of bulk data sent between nodes, so that the memory on the receiving data nodes isn't overwhelmed via transport_worker?
Environment: Elasticsearch 8.1.0
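To make it concrete, here is the kind of client-side limit I have in mind. This is only a sketch assuming the official Python client (elasticsearch-py) and helpers.streaming_bulk; my real pipeline may use a different client, and source_documents(), the endpoint and the api_key are placeholders.

# Sketch: capping how much data each bulk request carries to the data nodes.
# Assumes elasticsearch-py 8.x; endpoint, api_key and source_documents() are placeholders.
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("https://ingest-host:9200", api_key="...")

def actions():
    # one index action per document coming from the source
    for doc in source_documents():  # source_documents() is hypothetical
        yield {"_index": "my-index", "_source": doc}

# chunk_size / max_chunk_bytes bound the size of every
# indices:data/write/bulk request the client sends.
for ok, item in helpers.streaming_bulk(
    es,
    actions(),
    chunk_size=200,                   # at most 200 docs per bulk request
    max_chunk_bytes=5 * 1024 * 1024,  # and at most ~5 MB per bulk request
    raise_on_error=False,
):
    if not ok:
        print("bulk item failed:", item)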
From the ingest node (es_data_ssd_3_1_ingest) logs:
{"@timestamp":"2023-02-22T21:05:17.966Z", "log.level":"ERROR", "message":"failed to clean async result [Fk93UHNqek45VDltckwwZzFpTW9ROVEgdTRkejNENERSU0tWWXFJZzRlLVFiUToy=]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es_data_ssd_3_1_ingest][transport_workerog.logger":"org.elasticsearch.xpack.core.async.DeleteAsyncResultsService","trace.id":"b5ae78782e50ff55a5e3fb93952aea31","elasticsearch.cluster.uuid":"XDEw48F5SEu3KcS3_jticsearch.node.id":"u4dz3D4DRSKVYqIg4e-QbQ","elasticsearch.node.name":"es_data_ssd_3_1_ingest","elasticsearch.cluster.name":"elk_cluster","error.type":"org.elasticsearc.RemoteTransportException","error.message":"[es_data_ssd_2_3][10.0.9.224:9300][indices:data/write/bulk[s]]","error.stack_trace":"org.elasticsearch.transport.RemoteTranson: [es_data_ssd_2_3][10.0.9.224:9300][indices:data/write/bulk[s]]\nCaused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data s:data/write/bulk[s]] would be [8474736228/7.8gb], which is larger than the limit of [8418135900/7.8gb], real usage: [8474735912/7.8gb], new bytes reserved: [316/316b],del_inference=0/0b, inflight_requests=316/316b, request=557056/544kb, fielddata=1449832688/1.3gb, eql_sequence=0/0b]\n\tat org.elasticsearch.indices.breaker.HierarchyCirService.checkParentLimit(HierarchyCircuitBreakerService.java:440)\n\tat org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMtBreaker.java:108)\n\tat org.elasticsearch.transport.InboundAggregator.checkBreaker(InboundAggregator.java:215)\n\tat org.elasticsearch.transport.InboundAggregator.finion(InboundAggregator.java:119)\n\tat org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:147)\n\tat org.elasticsearch.transport.InboundPipdleBytes(InboundPipeline.java:121)\n\tat org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:86)\n\tat org.elasticsearch.transport.netty4.NettynnelHandler.channelRead(Netty4MessageChannelHandler.java:74)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:3o.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelctChannelHandlerContext.java:357)\n\tat io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280)\n\tat io.netty.channel.AbstractChannelHandlerContexnnelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat ionel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessjava:103)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.AbstractChannelHandlerCoeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat ndler.ssl.SslHandler.unwrap(SslHandler.java:1371)\n\tat io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234)\n\tat io.netty.handler.ssl.SslHandler.andler.java:1283)\n\tat io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510)\n\tat 
io.netty.handler.codec.ByteToMes.callDecode(ByteToMessageDecoder.java:449)\n\tat io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)\n\tat io.netty.channel.AbstractCerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContextn\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat io.netty.channel.DefaultChannelPipeline$HeadContext.cDefaultChannelPipeline.java:1410)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.nnelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:9o.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.jtat io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586)etty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)\n\tat io.neternal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat
From the target data node (es_data_ssd_2_3) GC logs:
[2023-02-22T21:05:11.679+0000][7][gc ] GC(749076) Pause Full (G1 Compaction Pause) 8168M->7467M(8192M) 910.329ms
[2023-02-22T21:05:11.680+0000][7][gc,cpu ] GC(749076) User=21.93s Sys=2.28s Real=0.91s
[2023-02-22T21:05:11.680+0000][7][safepoint ] Safepoint "G1CollectForAllocation", Time since last: 126898 ns, Reaching safepoint: 3584983 ns, At safepoint: 923835693 ns, Total: 927420676 ns
[2023-02-22T21:05:11.682+0000][7][gc,marking ] GC(749071) Concurrent Mark From Roots 1203.205ms
[2023-02-22T21:05:11.682+0000][7][gc,marking ] GC(749071) Concurrent Mark Abort
[2023-02-22T21:05:11.682+0000][7][gc ] GC(749071) Concurrent Mark Cycle 1212.829ms
[2023-02-22T21:05:11.698+0000][7][gc,start ] GC(749077) Pause Young (Normal) (G1 Preventive Collection)
[2023-02-22T21:05:11.698+0000][7][gc,task ] GC(749077) Using 43 workers of 43 for evacuation
[2023-02-22T21:05:11.698+0000][7][gc,age ] GC(749077) Desired survivor size 27262976 bytes, new threshold 15 (max threshold 15)
[2023-02-22T21:05:11.747+0000][7][gc,age ] GC(749077) Age table with threshold 15 (max threshold 15)
[2023-02-22T21:05:11.747+0000][7][gc,age ] GC(749077) - age 1: 54319424 bytes, 54319424 total
[2023-02-22T21:05:11.747+0000][7][gc ] GC(749077) To-space exhausted
[2023-02-22T21:05:11.747+0000][7][gc,phases ] GC(749077) Pre Evacuate Collection Set: 1.2ms
[2023-02-22T21:05:11.747+0000][7][gc,phases ] GC(749077) Merge Heap Roots: 0.3ms
[2023-02-22T21:05:11.747+0000][7][gc,phases ] GC(749077) Evacuate Collection Set: 40.6ms
[2023-02-22T21:05:11.747+0000][7][gc,phases ] GC(749077) Post Evacuate Collection Set: 6.5ms
[2023-02-22T21:05:11.747+0000][7][gc,phases ] GC(749077) Other: 0.5ms
[2023-02-22T21:05:11.747+0000][7][gc,heap ] GC(749077) Eden regions: 84->0(89)
[2023-02-22T21:05:11.747+0000][7][gc,heap ] GC(749077) Survivor regions: 0->13(13)
[2023-02-22T21:05:11.747+0000][7][gc,heap ] GC(749077) Old regions: 1802->1922
[2023-02-22T21:05:11.747+0000][7][gc,heap ] GC(749077) Archive regions: 2->2
[2023-02-22T21:05:11.747+0000][7][gc,heap ] GC(749077) Humongous regions: 105->105
[2023-02-22T21:05:11.747+0000][7][gc,metaspace ] GC(749077) Metaspace: 132894K(134848K)->132894K(134848K) NonClass: 115859K(116992K)->115859K(116992K) Class: 17035K(17856K)->17035K(17856K)
[2023-02-22T21:05:11.747+0000][7][gc ] GC(749077) Pause Young (Normal) (G1 Preventive Collection) 7803M->7999M(8192M) 49.110ms
[2023-02-22T21:05:11.747+0000][7][gc,cpu ] GC(749077) User=0.30s Sys=0.04s Real=0.05s
[2023-02-22T21:05:11.751+0000][7][safepoint ] Safepoint "G1CollectForAllocation", Time since last: 17182249 ns, Reaching safepoint: 815173 ns, At safepoint: 52579397 ns, Total: 53394570 ns
From the target data node (es_data_ssd_2_3) logs:
{"@timestamp":"2023-02-22T21:05:10.727Z", "log.level": "INFO", "message":"attempting to trigger G1GC due to high heap usage [8540251320]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es_data_ssd_2_3][transport_worker][T#10]","log.logger":"org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService","elasticsearch.cluster.uuid":"XDEw48F5SEu3KcS3_jDNcw","elasticsearch.node.id":"YEzEpdGpT7iECEXqVTVhDQ","elasticsearch.node.name":"es_data_ssd_2_3","elasticsearch.cluster.name":"elk_cluster"}