Taking a snapshot causes nodes to fall out of the cluster

I've got a 22-node cluster (64GB RAM, 16 or 32 cores, 6x2TB SSD in RAID 5 per node) along with 3 master-eligible nodes. I'm trying to take snapshots of a few indices, and whenever I do, I see messages like:
[o.e.c.c.C.CoordinatorPublication] [graylog-phx1-0002] after [10s] publication of cluster state version [10961113] is still waiting for {elasticsearch-phx1-0002}{7D86K9qxQOuygnrXBPMpwA}{ulPGNyz7Tuu6mQnKgIEy9g}{10.10.5.52}{10.10.5.52:9300}{dir} [SENT_APPLY_COMMIT]

The nodes are not heavily loaded in terms of CPU or RAM usage. There is ~77TB of data spread across the nodes, with 2 shards per node for each index, and the indices are sized so that each shard is ~15GB. Searching through Kibana works fine, but taking a snapshot seems to overload the cluster, and I'm not sure why.
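For reference, I'm triggering snapshots through the standard snapshot API, roughly like this (the repository, snapshot, and index names here are placeholders, not my real ones):

```shell
# Take a snapshot of a couple of indices (names are illustrative).
curl -X PUT "localhost:9200/_snapshot/my_repo/snapshot_1?pretty" \
  -H 'Content-Type: application/json' -d'
{
  "indices": "index-1,index-2",
  "include_global_state": false
}'

# Watch its progress via the snapshot status API.
curl -X GET "localhost:9200/_snapshot/my_repo/snapshot_1/_status?pretty"
```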

What version of Elasticsearch are you using? Can you share some more logs? There's not much to go on here.


I'm using v7.10.2 (due to reasons outside of my control).
Here's something from the master node:

[2022-04-04T19:08:09,207][INFO ][o.e.s.SnapshotsService   ] [graylog-phx1-0002] snapshot [security_applogs:phx1-applogs-222-1/JBxu8WA5SBqdea17hSOK_w] started
[2022-04-04T19:08:44,650][INFO ][o.e.c.c.C.CoordinatorPublication] [graylog-phx1-0002] after [10s] publication of cluster state version [10968944] is still waiting for {elasticsearch-phx1-0005}{kDqYZkDUTYq3JUiIZb8P0g}{geFpM5FcQj2jaPOVS-8QPA}{10.10.5.55}{10.10.5.55:9300}{dir} [SENT_APPLY_COMMIT]

Followed by logs from the node that fell out of the cluster:

[2022-04-04T19:09:14,667][WARN ][o.e.c.s.ClusterApplierService] [elasticsearch-phx1-0005] cluster state applier task [ApplyCommitRequest{term=237, version=10968944, sourceNode={graylog-phx1-0002}{VoAK04_LT3a0nh4IUMszbg}{FQKSIAfiT66CfMxcS72NaA}{10.10.5.47}{10.10.5.47:9300}{m}}] took [40s] which is above the warn threshold of [30s]: [running task [ApplyCommitRequest{term=237, version=10968944, sourceNode={graylog-phx1-0002}{VoAK04_LT3a0nh4IUMszbg}{FQKSIAfiT66CfMxcS72NaA}{10.10.5.47}{10.10.5.47:9300}{m}}]] took [0ms], [connecting to new nodes] took [0ms], [applying settings] took [0ms], [running applier [org.elasticsearch.repositories.RepositoriesService@7f609491]] took [0ms], [running applier [org.elasticsearch.indices.cluster.IndicesClusterStateService@4ae0ae9c]] took [39984ms], [running applier [org.elasticsearch.script.ScriptService@1e1ffe2c]] took [0ms], [running applier [org.elasticsearch.ingest.IngestService@57c1bfa2]] took [0ms], [running applier [org.elasticsearch.action.ingest.IngestActionForwarder@662ae1bb]] took [0ms], [running applier [org.elasticsearch.tasks.TaskManager@16a637de]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.sql.legacy.esdomain.LocalClusterState$$Lambda$1988/0x00000017c116dc40@51a3b1a6]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.security.configuration.ClusterInfoHolder@6035b510]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.alerting.alerts.AlertIndices@478a660f]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.alerting.core.JobSweeper@12af054]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.jobscheduler.sweeper.JobSweeper@adc3c60]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.indexmanagement.indexstatemanagement.IndexStateManagementHistory@4de900b5]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.indexmanagement.indexstatemanagement.ManagedIndexCoordinator@26bd1a54]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.ad.indices.AnomalyDetectionIndices@5ce22475]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.ad.cluster.ADClusterEventListener@48e3c2d1]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.ad.cluster.MasterEventListener@587b3a7f]] took [0ms], [notifying listener [org.elasticsearch.node.ResponseCollectorService@3e7adae6]] took [0ms], [notifying listener [org.elasticsearch.snapshots.SnapshotShardsService@24f6b5c8]] took [0ms], [notifying listener [org.elasticsearch.indices.store.IndicesStore@5fba737c]] took [1ms], [notifying listener [org.elasticsearch.persistent.PersistentTasksNodeService@82f946e]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.search.asynchronous.management.AsynchronousSearchManagementService@5630aa64]] took [0ms], [notifying listener [org.elasticsearch.indices.recovery.PeerRecoverySourceService@21b54e97]] took [0ms]
[2022-04-04T19:10:43,308][WARN ][o.e.m.f.FsHealthService  ] [elasticsearch-phx1-0005] health check of [/var/lib/elasticsearch/nodes/0] took [70628ms] which is above the warn threshold of [5s]
[2022-04-04T19:10:43,321][WARN ][o.e.g.PersistedClusterStateService] [elasticsearch-phx1-0005] writing cluster state took [128732ms] which is above the warn threshold of [10s]; wrote global metadata [false] and metadata for [1] indices and skipped [250] unchanged indices
[2022-04-04T19:12:48,542][INFO ][c.a.o.j.s.JobSweeper     ] [elasticsearch-phx1-0005] Running full sweep
[2022-04-04T19:14:25,910][INFO ][o.e.c.c.Coordinator      ] [elasticsearch-phx1-0005] master node [{graylog-phx1-0002}{VoAK04_LT3a0nh4IUMszbg}{FQKSIAfiT66CfMxcS72NaA}{10.10.5.47}{10.10.5.47:9300}{m}] failed, restarting discovery
org.elasticsearch.ElasticsearchException: node [{graylog-phx1-0002}{VoAK04_LT3a0nh4IUMszbg}{FQKSIAfiT66CfMxcS72NaA}{10.10.5.47}{10.10.5.47:9300}{m}] failed [3] consecutive checks
        at org.elasticsearch.cluster.coordination.LeaderChecker$CheckScheduler$1.handleException(LeaderChecker.java:293) ~[elasticsearch-7.10.2.jar:7.10.2]
        at com.amazon.opendistroforelasticsearch.security.transport.OpenDistroSecurityInterceptor$RestoringTransportResponseHandler.handleException(OpenDistroSecurityInterceptor.java:284) ~[?:?]
        at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1181) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundHandler.lambda$handleException$3(InboundHandler.java:277) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:224) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:275) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:267) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:131) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:89) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:700) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[?:?]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [graylog-phx1-0002][10.10.5.47:9300][internal:coordination/fault_detection/leader_check]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: rejecting leader check since [{elasticsearch-phx1-0005}{kDqYZkDUTYq3JUiIZb8P0g}{geFpM5FcQj2jaPOVS-8QPA}{10.10.5.55}{10.10.5.55:9300}{dir}] has been removed from the cluster
        at org.elasticsearch.cluster.coordination.LeaderChecker.handleLeaderCheck(LeaderChecker.java:192) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.cluster.coordination.LeaderChecker.lambda$new$0(LeaderChecker.java:113) ~[elasticsearch-7.10.2.jar:7.10.2]
        at com.amazon.opendistroforelasticsearch.security.ssl.transport.OpenDistroSecuritySSLRequestHandler.messageReceivedDecorate(OpenDistroSecuritySSLRequestHandler.java:182) ~[?:?]
        at com.amazon.opendistroforelasticsearch.security.transport.OpenDistroSecurityRequestHandler.messageReceivedDecorate(OpenDistroSecurityRequestHandler.java:293) ~[?:?]
        at com.amazon.opendistroforelasticsearch.security.ssl.transport.OpenDistroSecuritySSLRequestHandler.messageReceived(OpenDistroSecuritySSLRequestHandler.java:142) ~[?:?]
        at com.amazon.opendistroforelasticsearch.security.OpenDistroSecurityPlugin$7$1.messageReceived(OpenDistroSecurityPlugin.java:639) ~[?:?]
        at com.amazon.opendistroforelasticsearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:124) ~[?:?]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:72) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:207) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:107) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:89) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:700) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[?:?]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        at java.lang.Thread.run(Thread.java:834) ~[?:?]
[2022-04-04T19:15:18,959][WARN ][o.e.m.f.FsHealthService  ] [elasticsearch-phx1-0005] health check of [/var/lib/elasticsearch/nodes/0] took [155655ms] which is above the warn threshold of [5s]
[2022-04-04T19:15:18,969][WARN ][o.e.g.PersistedClusterStateService] [elasticsearch-phx1-0005] writing cluster state took [175602ms] which is above the warn threshold of [10s]; wrote global metadata [false] and metadata for [1] indices and skipped [250] unchanged indices
[2022-04-04T19:15:18,974][WARN ][o.e.c.s.ClusterApplierService] [elasticsearch-phx1-0005] cluster state applier task [ApplyCommitRequest{term=237, version=10968998, sourceNode={graylog-phx1-0002}{VoAK04_LT3a0nh4IUMszbg}{FQKSIAfiT66CfMxcS72NaA}{10.10.5.47}{10.10.5.47:9300}{m}}] took [2.9m] which is above the warn threshold of [30s]: [running task [ApplyCommitRequest{term=237, version=10968998, sourceNode={graylog-phx1-0002}{VoAK04_LT3a0nh4IUMszbg}{FQKSIAfiT66CfMxcS72NaA}{10.10.5.47}{10.10.5.47:9300}{m}}]] took [0ms], [connecting to new nodes] took [0ms], [applying settings] took [0ms], [running applier [org.elasticsearch.repositories.RepositoriesService@7f609491]] took [0ms], [running applier [org.elasticsearch.indices.cluster.IndicesClusterStateService@4ae0ae9c]] took [175462ms], [running applier [org.elasticsearch.script.ScriptService@1e1ffe2c]] took [0ms], [running applier [org.elasticsearch.ingest.IngestService@57c1bfa2]] took [0ms], [running applier [org.elasticsearch.action.ingest.IngestActionForwarder@662ae1bb]] took [0ms], [running applier [org.elasticsearch.tasks.TaskManager@16a637de]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.sql.legacy.esdomain.LocalClusterState$$Lambda$1988/0x00000017c116dc40@51a3b1a6]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.security.configuration.ClusterInfoHolder@6035b510]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.alerting.alerts.AlertIndices@478a660f]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.alerting.core.JobSweeper@12af054]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.jobscheduler.sweeper.JobSweeper@adc3c60]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.indexmanagement.indexstatemanagement.IndexStateManagementHistory@4de900b5]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.indexmanagement.indexstatemanagement.ManagedIndexCoordinator@26bd1a54]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.ad.indices.AnomalyDetectionIndices@5ce22475]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.ad.cluster.ADClusterEventListener@48e3c2d1]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.ad.cluster.MasterEventListener@587b3a7f]] took [0ms], [notifying listener [org.elasticsearch.node.ResponseCollectorService@3e7adae6]] took [0ms], [notifying listener [org.elasticsearch.snapshots.SnapshotShardsService@24f6b5c8]] took [0ms], [notifying listener [org.elasticsearch.indices.store.IndicesStore@5fba737c]] took [2ms], [notifying listener [org.elasticsearch.persistent.PersistentTasksNodeService@82f946e]] took [0ms], [notifying listener [com.amazon.opendistroforelasticsearch.search.asynchronous.management.AsynchronousSearchManagementService@5630aa64]] took [0ms], [notifying listener [org.elasticsearch.indices.recovery.PeerRecoverySourceService@21b54e97]] took [0ms]
[2022-04-04T19:15:23,932][WARN ][o.e.s.SnapshotShardsService] [elasticsearch-phx1-0005] [[ecs-000222][8]][security_applogs:phx1-applogs-222-1/JBxu8WA5SBqdea17hSOK_w] failed to snapshot shard
java.lang.IllegalStateException: Unable to move the shard snapshot status to [FINALIZE]: expecting [STARTED] but got [ABORTED]
        at org.elasticsearch.index.snapshots.IndexShardSnapshotStatus.moveToFinalize(IndexShardSnapshotStatus.java:110) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.lambda$snapshotShard$64(BlobStoreRepository.java:2045) ~[elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.ListenableFuture$1.doRun(ListenableFuture.java:112) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:224) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.ListenableFuture.notifyListener(ListenableFuture.java:106) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.ListenableFuture.lambda$done$0(ListenableFuture.java:98) [elasticsearch-7.10.2.jar:7.10.2]
        at java.util.ArrayList.forEach(ArrayList.java:1540) [?:?]
        at org.elasticsearch.common.util.concurrent.ListenableFuture.done(ListenableFuture.java:98) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.BaseFuture.set(BaseFuture.java:144) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.ListenableFuture.onResponse(ListenableFuture.java:127) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.action.StepListener.innerOnResponse(StepListener.java:62) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.action.NotifyOnceListener.onResponse(NotifyOnceListener.java:40) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.action.support.GroupedActionListener.onResponse(GroupedActionListener.java:66) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:89) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.executeOneFileSnapshot(BlobStoreRepository.java:2087) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.repositories.blobstore.BlobStoreRepository.lambda$executeOneFileSnapshot$65(BlobStoreRepository.java:2092) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:73) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:743) [elasticsearch-7.10.2.jar:7.10.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.10.2.jar:7.10.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
[2022-04-04T19:16:56,966][INFO ][o.e.m.j.JvmGcMonitorService] [elasticsearch-phx1-0005] [gc][331356] overhead, spent [292ms] collecting in the last [1s]

Ah, you're using Open Distro for Elasticsearch, which includes a bunch of third-party plugins. Can you reproduce this with the Elastic distribution? If not, you'll need to ask on an ODFE forum; the experts on your distribution typically don't answer questions here.
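If you're not sure exactly which third-party plugins are installed, the cat plugins API will list them per node (assuming you can reach a node's HTTP port; `localhost:9200` here is just an example address):

```shell
# List installed plugins on every node; ODFE plugins show up as
# opendistro_* entries, whereas a plain Elastic install lists none of them.
curl -s "localhost:9200/_cat/plugins?v"
```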

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.