Is "http://localhost:9200/_all/_stats/_all" an expensive call?

We are using OpenTelemetry to monitor our Elasticsearch cluster, and I found that the OpenTelemetry receiver sends requests to "http://localhost:9200/_all/_stats/_all" to pull the metrics. One issue I noticed: when the receiver sends this request every 5-10 seconds, the CPU usage of every data node climbs to 200%. Later on, each data node slowly loses its connection to the master node and is eventually disconnected.

Is this expected? Any suggestions or recommendations are highly appreciated.

Roy

This is a sample hot_threads output from one of those data nodes:

::: {my-data-node-0}{yzri61eCQcq1jO5HttyZkw}{eL2Xk19-TfGroghH_AVnNQ}{10.112.42.39}{10.112.42.39:9300}{availability_zone=us-west-2b, ml.machine_memory=8589934592, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
   Hot threads at 2023-02-19T10:01:18.754Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
   41.5% (207.6ms out of 500ms) cpu usage by thread 'elasticsearch[my-data-node-0][management][T#2]'
     6/10 snapshots sharing following 35 elements
       java.io.UnixFileSystem.canonicalize0(Native Method)
       java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:177)
       java.io.File.getCanonicalPath(File.java:626)
       java.io.FilePermission$1.run(FilePermission.java:248)
       java.io.FilePermission$1.run(FilePermission.java:236)
       java.security.AccessController.doPrivileged(Native Method)
       java.io.FilePermission.init(FilePermission.java:236)
       java.io.FilePermission.<init>(FilePermission.java:310)
       java.lang.SecurityManager.checkRead(SecurityManager.java:888)
       sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
       sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49)
       sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
       sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
       java.nio.file.Files.readAttributes(Files.java:1737)
       java.nio.file.Files.getLastModifiedTime(Files.java:2266)
       org.elasticsearch.index.translog.BaseTranslogReader.getLastModifiedTime(BaseTranslogReader.java:147)
       org.elasticsearch.index.translog.Translog.findEarliestLastModifiedAge(Translog.java:436)
       org.elasticsearch.index.translog.Translog.earliestLastModifiedAge(Translog.java:424)
       org.elasticsearch.index.translog.Translog.stats(Translog.java:862)
       org.elasticsearch.index.engine.InternalEngine.getTranslogStats(InternalEngine.java:551)
       org.elasticsearch.index.shard.IndexShard.translogStats(IndexShard.java:1067)
       org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:216)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:185)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:49)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:436)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:414)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:401)
       org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30)
       org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
       org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1087)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:778)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       java.lang.Thread.run(Thread.java:750)
     3/10 snapshots sharing following 28 elements
       sun.nio.fs.UnixNativeDispatcher.stat0(Native Method)
       sun.nio.fs.UnixNativeDispatcher.stat(UnixNativeDispatcher.java:291)
       sun.nio.fs.UnixFileAttributes.get(UnixFileAttributes.java:70)
       sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:52)
       sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
       sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
       java.nio.file.Files.readAttributes(Files.java:1737)
       java.nio.file.Files.getLastModifiedTime(Files.java:2266)
       org.elasticsearch.index.translog.BaseTranslogReader.getLastModifiedTime(BaseTranslogReader.java:147)
       org.elasticsearch.index.translog.Translog.findEarliestLastModifiedAge(Translog.java:436)
       org.elasticsearch.index.translog.Translog.earliestLastModifiedAge(Translog.java:424)
       org.elasticsearch.index.translog.Translog.stats(Translog.java:862)
       org.elasticsearch.index.engine.InternalEngine.getTranslogStats(InternalEngine.java:551)
       org.elasticsearch.index.shard.IndexShard.translogStats(IndexShard.java:1067)
       org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:216)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:185)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:49)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:436)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:414)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:401)
       org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30)
       org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
       org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1087)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:778)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       java.lang.Thread.run(Thread.java:750)
     unique snapshot
       java.nio.CharBuffer.wrap(CharBuffer.java:373)
       java.nio.CharBuffer.wrap(CharBuffer.java:396)
       sun.nio.fs.UnixPath.encode(UnixPath.java:136)
       sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
       sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
       java.io.FilePermission.init(FilePermission.java:228)
       java.io.FilePermission.<init>(FilePermission.java:310)
       java.lang.SecurityManager.checkRead(SecurityManager.java:888)
       sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
       sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49)
       sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
       sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
       java.nio.file.Files.readAttributes(Files.java:1737)
       java.nio.file.Files.getLastModifiedTime(Files.java:2266)
       org.elasticsearch.index.translog.BaseTranslogReader.getLastModifiedTime(BaseTranslogReader.java:147)
       org.elasticsearch.index.translog.Translog.findEarliestLastModifiedAge(Translog.java:436)
       org.elasticsearch.index.translog.Translog.earliestLastModifiedAge(Translog.java:424)
       org.elasticsearch.index.translog.Translog.stats(Translog.java:862)
       org.elasticsearch.index.engine.InternalEngine.getTranslogStats(InternalEngine.java:551)
       org.elasticsearch.index.shard.IndexShard.translogStats(IndexShard.java:1067)
       org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:216)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:185)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:49)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:436)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:414)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:401)
       org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30)
       org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
       org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1087)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:778)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       java.lang.Thread.run(Thread.java:750)
   
   39.3% (196.3ms out of 500ms) cpu usage by thread 'elasticsearch[my-data-node-0][management][T#1]'
     9/10 snapshots sharing following 35 elements
       java.io.UnixFileSystem.canonicalize0(Native Method)
       java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:177)
       java.io.File.getCanonicalPath(File.java:626)
       java.io.FilePermission$1.run(FilePermission.java:248)
       java.io.FilePermission$1.run(FilePermission.java:236)
       java.security.AccessController.doPrivileged(Native Method)
       java.io.FilePermission.init(FilePermission.java:236)
       java.io.FilePermission.<init>(FilePermission.java:310)
       java.lang.SecurityManager.checkRead(SecurityManager.java:888)
       sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
       sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49)
       sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
       sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
       java.nio.file.Files.readAttributes(Files.java:1737)
       java.nio.file.Files.getLastModifiedTime(Files.java:2266)
       org.elasticsearch.index.translog.BaseTranslogReader.getLastModifiedTime(BaseTranslogReader.java:147)
       org.elasticsearch.index.translog.Translog.findEarliestLastModifiedAge(Translog.java:436)
       org.elasticsearch.index.translog.Translog.earliestLastModifiedAge(Translog.java:424)
       org.elasticsearch.index.translog.Translog.stats(Translog.java:862)
       org.elasticsearch.index.engine.InternalEngine.getTranslogStats(InternalEngine.java:551)
       org.elasticsearch.index.shard.IndexShard.translogStats(IndexShard.java:1067)
       org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:216)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:185)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:49)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:436)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:414)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:401)
       org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30)
       org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
       org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1087)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:778)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       java.lang.Thread.run(Thread.java:750)
     unique snapshot
       java.security.AccessController.getStackAccessControlContext(Native Method)
       java.security.AccessController.checkPermission(AccessController.java:860)
       java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
       java.lang.SecurityManager.checkRead(SecurityManager.java:888)
       sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
       sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49)
       sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
       sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
       java.nio.file.Files.readAttributes(Files.java:1737)
       java.nio.file.Files.getLastModifiedTime(Files.java:2266)
       org.elasticsearch.index.translog.BaseTranslogReader.getLastModifiedTime(BaseTranslogReader.java:147)
       org.elasticsearch.index.translog.Translog.findEarliestLastModifiedAge(Translog.java:436)
       org.elasticsearch.index.translog.Translog.earliestLastModifiedAge(Translog.java:424)
       org.elasticsearch.index.translog.Translog.stats(Translog.java:862)
       org.elasticsearch.index.engine.InternalEngine.getTranslogStats(InternalEngine.java:551)
       org.elasticsearch.index.shard.IndexShard.translogStats(IndexShard.java:1067)
       org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:216)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:185)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:49)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:436)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:414)
       org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:401)
       org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30)
       org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
       org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1087)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:778)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       java.lang.Thread.run(Thread.java:750)
   
   34.6% (173.1ms out of 500ms) cpu usage by thread 'elasticsearch[my-data-node-0][generic][T#5]'
     5/10 snapshots sharing following 29 elements
       java.io.UnixFileSystem.canonicalize0(Native Method)
       java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:177)
       java.io.File.getCanonicalPath(File.java:626)
       java.io.FilePermission$1.run(FilePermission.java:248)
       java.io.FilePermission$1.run(FilePermission.java:236)
       java.security.AccessController.doPrivileged(Native Method)
       java.io.FilePermission.init(FilePermission.java:236)
       java.io.FilePermission.<init>(FilePermission.java:310)
       java.lang.SecurityManager.checkRead(SecurityManager.java:888)
       sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
       sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49)
       sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
       sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
       java.nio.file.Files.readAttributes(Files.java:1737)
       java.nio.file.Files.getLastModifiedTime(Files.java:2266)
       org.elasticsearch.index.translog.BaseTranslogReader.getLastModifiedTime(BaseTranslogReader.java:147)
       org.elasticsearch.index.translog.TranslogDeletionPolicy.getMinTranslogGenByAge(TranslogDeletionPolicy.java:197)
       org.elasticsearch.index.translog.TranslogDeletionPolicy.minTranslogGenRequired(TranslogDeletionPolicy.java:165)
       org.elasticsearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1732)
       org.elasticsearch.index.engine.InternalEngine.trimUnreferencedTranslogFiles(InternalEngine.java:1915)
       org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1130)
       org.elasticsearch.index.IndexService.maybeTrimTranslog(IndexService.java:765)
       org.elasticsearch.index.IndexService.access$600(IndexService.java:100)
       org.elasticsearch.index.IndexService$AsyncTrimTranslogTask.runInternal(IndexService.java:894)
       org.elasticsearch.common.util.concurrent.AbstractAsyncTask.run(AbstractAsyncTask.java:144)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:708)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       java.lang.Thread.run(Thread.java:750)
     3/10 snapshots sharing following 22 elements
       sun.nio.fs.UnixNativeDispatcher.stat0(Native Method)
       sun.nio.fs.UnixNativeDispatcher.stat(UnixNativeDispatcher.java:291)
       sun.nio.fs.UnixFileAttributes.get(UnixFileAttributes.java:70)
       sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:52)
       sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
       sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
       java.nio.file.Files.readAttributes(Files.java:1737)
       java.nio.file.Files.getLastModifiedTime(Files.java:2266)
       org.elasticsearch.index.translog.BaseTranslogReader.getLastModifiedTime(BaseTranslogReader.java:147)
       org.elasticsearch.index.translog.TranslogDeletionPolicy.getMinTranslogGenByAge(TranslogDeletionPolicy.java:197)
       org.elasticsearch.index.translog.TranslogDeletionPolicy.minTranslogGenRequired(TranslogDeletionPolicy.java:165)
       org.elasticsearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1732)
       org.elasticsearch.index.engine.InternalEngine.trimUnreferencedTranslogFiles(InternalEngine.java:1915)
       org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1130)
       org.elasticsearch.index.IndexService.maybeTrimTranslog(IndexService.java:765)
       org.elasticsearch.index.IndexService.access$600(IndexService.java:100)
       org.elasticsearch.index.IndexService$AsyncTrimTranslogTask.runInternal(IndexService.java:894)
       org.elasticsearch.common.util.concurrent.AbstractAsyncTask.run(AbstractAsyncTask.java:144)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:708)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       java.lang.Thread.run(Thread.java:750)
     2/10 snapshots sharing following 27 elements
       java.io.File.getCanonicalPath(File.java:626)
       java.io.FilePermission$1.run(FilePermission.java:248)
       java.io.FilePermission$1.run(FilePermission.java:236)
       java.security.AccessController.doPrivileged(Native Method)
       java.io.FilePermission.init(FilePermission.java:236)
       java.io.FilePermission.<init>(FilePermission.java:310)
       java.lang.SecurityManager.checkRead(SecurityManager.java:888)
       sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
       sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49)
       sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
       sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
       java.nio.file.Files.readAttributes(Files.java:1737)
       java.nio.file.Files.getLastModifiedTime(Files.java:2266)
       org.elasticsearch.index.translog.BaseTranslogReader.getLastModifiedTime(BaseTranslogReader.java:147)
       org.elasticsearch.index.translog.TranslogDeletionPolicy.getMinTranslogGenByAge(TranslogDeletionPolicy.java:197)
       org.elasticsearch.index.translog.TranslogDeletionPolicy.minTranslogGenRequired(TranslogDeletionPolicy.java:165)
       org.elasticsearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1732)
       org.elasticsearch.index.engine.InternalEngine.trimUnreferencedTranslogFiles(InternalEngine.java:1915)
       org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1130)
       org.elasticsearch.index.IndexService.maybeTrimTranslog(IndexService.java:765)
       org.elasticsearch.index.IndexService.access$600(IndexService.java:100)
       org.elasticsearch.index.IndexService$AsyncTrimTranslogTask.runInternal(IndexService.java:894)
       org.elasticsearch.common.util.concurrent.AbstractAsyncTask.run(AbstractAsyncTask.java:144)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:708)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       java.lang.Thread.run(Thread.java:750)
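
For anyone reading a dump like the one above: each block is one busy thread, its share of CPU over the sampling interval, and the stack traces seen in the sampled snapshots ("6/10 snapshots" means 6 of the 10 samples shared that stack). Below is a rough Python sketch for tallying which Elasticsearch callers dominate each thread. The function names `parse_hot_threads` and `top_callers` are made up for illustration, and the regexes assume the text layout shown above; other versions may format the output differently.

```python
import re
from collections import Counter

# Header line, e.g.:
#   41.5% (207.6ms out of 500ms) cpu usage by thread 'elasticsearch[node][management][T#2]'
HEADER = re.compile(r"^([\d.]+)% \(.*?\) cpu usage by thread '(.+)'$")
# Stack group line, e.g.: 6/10 snapshots sharing following 35 elements
GROUP = re.compile(r"^(\d+)/(\d+) snapshots sharing following \d+ elements$")


def parse_hot_threads(text):
    """Split a hot_threads dump into threads, each with weighted stack groups."""
    threads = []
    for line in text.splitlines():
        stripped = line.strip()
        header = HEADER.match(stripped)
        group = GROUP.match(stripped)
        if header:
            threads.append({"cpu": float(header.group(1)),
                            "name": header.group(2),
                            "groups": []})
        elif not threads:
            continue  # node banner and "Hot threads at ..." lines
        elif group:
            threads[-1]["groups"].append((int(group.group(1)), []))
        elif stripped == "unique snapshot":
            threads[-1]["groups"].append((1, []))  # a stack seen in one sample
        elif stripped and threads[-1]["groups"]:
            # Keep the method name, drop the "(File.java:123)" suffix.
            threads[-1]["groups"][-1][1].append(stripped.split("(")[0])
    return threads


def top_callers(thread, depth=2):
    """Count snapshots by the innermost `depth` Elasticsearch frames."""
    counts = Counter()
    for weight, frames in thread["groups"]:
        es_frames = [f for f in frames if f.startswith("org.elasticsearch.")]
        if es_frames:
            counts[" <- ".join(es_frames[:depth])] += weight
    return counts
```

Applied to the dump above, both management threads resolve to `Translog.findEarliestLastModifiedAge`, which is reached from `TransportIndicesStatsAction` (the stats request), and the generic thread resolves to `TranslogDeletionPolicy.getMinTranslogGenByAge` (background translog trimming). In every case the time is going into reading the last-modified time of translog files.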

This API can be a little expensive sometimes, depending on various things - not least on the version you're using. What version is it?

Later on, each data node will slowly lose the connection to the master node and eventually be disconnected.

This is not expected, though.
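
On the cost of the stats call itself: the indices stats API accepts a comma-separated list of metrics in the path, so a request does not have to ask for `_all`. Here is a minimal Python sketch that builds such a narrower URL. The helper names `stats_url` and `fetch_stats` are hypothetical, the metric names `docs` and `store` are just examples from the API's metric list, and whether the OpenTelemetry receiver can be configured to request fewer metrics is not something established in this thread. Since the hot threads show time spent in translog stats, leaving `translog` out of the list might avoid that code path, but that is an untested assumption on 6.8.23.

```python
import json
from urllib.request import urlopen


def stats_url(base_url, metrics, indices=("_all",)):
    """Build an indices stats URL limited to the named metrics."""
    if not metrics:
        # An empty metric segment would fall back to returning every metric.
        raise ValueError("name at least one metric")
    return "{}/{}/_stats/{}".format(
        base_url.rstrip("/"), ",".join(indices), ",".join(metrics))


def fetch_stats(url, timeout=10):
    """Fetch and decode one stats response (needs a reachable cluster)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example: only document counts and store size for all indices.
url = stats_url("http://localhost:9200", ["docs", "store"])
```

Polling less often than every 5-10 seconds is the other obvious knob, independent of which metrics are requested.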

Thanks for replying David.
The version is 6.8.23.

Btw, what does this hot thread output mean?

The other thing I noticed is that we are using Envoy to handle the mTLS communication on both ports 9200 and 9300 on all nodes. Could this cause any issues?

Oh wow, that's very old. You're missing out on literally years of bug fixes and performance improvements, some of which relate to this issue. You need to upgrade to a supported version ASAP.

I doubt it relates to the issue you're asking about here, although all supported versions handle TLS themselves so that's another good reason to upgrade.

Thanks for the suggestion David, I will bring it back to the team. Is there a version you would recommend?
And another quick question: it seems OpenTelemetry sends 3 different requests (the one above plus 2 more). Are those 2 expensive calls too? Will upgrading to a later version of ES help in all 3 cases?

I'd recommend the latest version, which is currently 8.6.2, but any supported version is better than where you are now. See the "Elastic Product End of Life Dates" page on the Elastic website for more details.

I don't recall the details of the performance characteristics of 6.8, but there's not much point in thinking about that. Upgrading to a supported version needs to be your first priority.
