All shards failed

I have been having a series of issues since the backup filled up the disks and ES ground to a halt.

The cluster is now yellow; most indexes appear to be fine, but others have missing shards, which I am working on. Searches are failing for one application, and I see this error in the logs:

[2021-02-19T13:49:18,963][WARN ][r.suppressed             ] [secesprd02] path: /sessions2-210219%2Csessions2-210218/session/_search, params: {rest_total_hits_as_int=true, ignore_unavailable=true, preference=primaries, index=sessions2-210219,sessions2-210218, type=session}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:568) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:324) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:603) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:400) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$100(AbstractSearchAsyncAction.java:70) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:258) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:408) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.TransportService$6.handleException(TransportService.java:640) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1181) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.InboundHandler.lambda$handleException$3(InboundHandler.java:277) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:224) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:275) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:267) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:131) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:89) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:700) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) [elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) [transport-netty4-client-7.10.0.jar:7.10.0]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.49.Final.jar:4.1.49.Final]
        at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.elasticsearch.tasks.TaskCancelledException: cancelled
        at org.elasticsearch.search.query.QueryPhase.lambda$executeInternal$3(QueryPhase.java:285) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.internal.ContextIndexSearcher$MutableQueryTimeout.checkCancelled(ContextIndexSearcher.java:370) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:54) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39) ~[lucene-core-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:35:28]
        at org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:226) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:199) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445) ~[lucene-core-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:35:28]
        at org.elasticsearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:341) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.query.QueryPhase.executeInternal(QueryPhase.java:296) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:148) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:372) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:431) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.SearchService.access$500(SearchService.java:141) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.search.SearchService$2.lambda$onResponse$0(SearchService.java:401) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:58) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:73) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) ~[elasticsearch-7.10.0.jar:7.10.0]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.10.0.jar:7.10.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]

Kibana shows the index status as green.
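To double-check outside of Kibana, the cluster health and the per-shard state of those two indices can be queried directly against the REST API. A minimal sketch only; the host/port and the use of the Python requests client are placeholders, not details from this thread:

```python
import requests

ES = "http://localhost:9200"  # placeholder; point at the real cluster

# Overall cluster health: status plus the number of unassigned shards.
health = requests.get(f"{ES}/_cluster/health", timeout=30).json()
print(health["status"], health["unassigned_shards"])

# Per-shard state for the two indices the failing search targets,
# including the reason any unassigned shards are unassigned.
shards = requests.get(
    f"{ES}/_cat/shards/sessions2-210219,sessions2-210218",
    params={"v": "true", "h": "index,shard,prirep,state,unassigned.reason"},
    timeout=30,
)
print(shards.text)
```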

Any thoughts on what is causing this?

The search was cancelled before it completed, usually because the client closed the connection before it got a response; that, in turn, usually means the client hit an overenthusiastic timeout.
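If so, the fix belongs on the client side: raise (or remove) the timeout in whatever issues the search, so it waits for Elasticsearch to respond instead of dropping the connection. A rough illustration only, not the actual client in use here; the index names and URL parameters are taken from the log above, while the query body, host, and timeout value are assumptions:

```python
import requests

# Re-run the same search with a generous client-side timeout. If the HTTP
# client hangs up early, Elasticsearch cancels the task server-side, which
# surfaces as TaskCancelledException / "all shards failed" in the logs.
resp = requests.post(
    "http://localhost:9200/sessions2-210219,sessions2-210218/_search",
    params={
        "ignore_unavailable": "true",
        "preference": "primaries",
        "rest_total_hits_as_int": "true",
    },
    json={"query": {"match_all": {}}, "size": 10},  # placeholder query
    timeout=120,  # seconds; many clients default to something far shorter
)
resp.raise_for_status()
print(resp.json()["hits"]["total"])
```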
