"org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed - ES 7.6

I got the error below on ES 7.6, and queries were taking a long time to execute. Can anyone help with this?

I have deployed our cluster on Docker.
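For reference, this is roughly the request the application sends, reconstructed from the parameters in the log below (a sketch only: the host, index name, and query body are placeholders, since the actual query is not in the logs):

```python
import requests

# Hypothetical reconstruction of the failing search call, based on the
# query parameters Elasticsearch logged. Host, index name, and query body
# are placeholders, not the real values.
resp = requests.post(
    "http://localhost:9200/index_name/_search",
    params={
        "typed_keys": "true",
        "ignore_unavailable": "false",
        "expand_wildcards": "open",
        "allow_no_indices": "true",
        "preference": "4edb6af9-9a56-425d-94d4-a6e174efccae",
        "ignore_throttled": "true",
        "search_type": "dfs_query_then_fetch",
        "batched_reduce_size": "512",
        "ccs_minimize_roundtrips": "true",
    },
    json={"query": {"match_all": {}}},  # placeholder query
    timeout=30,
)
print(resp.status_code)
```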




{"log":"[2020-07-24T12:35:34,270][WARN ][r.suppressed             ] [cindex_nameluster-name] path: /index_name/_search, params: {typed_keys=true, ignore_unavailable=false, expand_wildcards=open, allow_no_indices=true, preference=4edb6af9-9a56-425d-94d4-a6e174efccae, ignore_throttled=true, index=index_name, search_type=dfs_query_then_fetch, batched_reduce_size=512, ccs_minimize_roundtrips=true}\n","stream":"stdout","time":"2020-07-24T12:35:34.310378193Z"}
{"log":"org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed\n","stream":"stdout","time":"2020-07-24T12:35:34.314772438Z"}
{"log":"\u0009at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:545) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314777829Z"}
{"log":"\u0009at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:306) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314781255Z"}
{"log":"\u0009at org.elasticsearch.action.search.FetchSearchPhase.moveToNextPhase(FetchSearchPhase.java:216) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314784016Z"}
{"log":"\u0009at org.elasticsearch.action.search.FetchSearchPhase.lambda$innerRun$1(FetchSearchPhase.java:105) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314786868Z"}
{"log":"\u0009at org.elasticsearch.action.search.CountedCollector.countDown(CountedCollector.java:53) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.31479074Z"}
{"log":"\u0009at org.elasticsearch.action.search.CountedCollector.onFailure(CountedCollector.java:76) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314793349Z"}
{"log":"\u0009at org.elasticsearch.action.search.FetchSearchPhase$2.onFailure(FetchSearchPhase.java:183) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314796052Z"}
{"log":"\u0009at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314798577Z"}
{"log":"\u0009at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:423) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314801175Z"}
{"log":"\u0009at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1118) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314803726Z"}
{"log":"\u0009at org.elasticsearch.transport.InboundHandler.lambda$handleException$2(InboundHandler.java:244) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314806462Z"}
{"log":"\u0009at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:225) [elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314811448Z"}
{"log":"\u0009at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:242) [elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314814038Z"}
{"log":"\u0009at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:234) [elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.31481646Z"}
{"log":"\u0009at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:137) [elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314822728Z"}
{"log":"\u0009at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:103) [elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314825088Z"}
{"log":"\u0009at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:667) [elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314827314Z"}
{"log":"\u0009at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62) [transport-netty4-client-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314829586Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314832025Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314834386Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314836726Z"}
{"log":"\u0009at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:326) [netty-codec-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314839068Z"}
{"log":"\u0009at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:300) [netty-codec-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314841466Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314843752Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.31484617Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314848526Z"}
{"log":"\u0009at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314851244Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314853534Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314855993Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314858332Z"}
{"log":"\u0009at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314860668Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314862994Z"}
{"log":"\u0009at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314867487Z"}
{"log":"\u0009at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314869844Z"}
{"log":"\u0009at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314872161Z"}
{"log":"\u0009at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314874498Z"}
{"log":"\u0009at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:600) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314876837Z"}
{"log":"\u0009at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:554) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.31487912Z"}
{"log":"\u0009at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) [netty-transport-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314881425Z"}
{"log":"\u0009at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) [netty-common-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314883683Z"}
{"log":"\u0009at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.43.Final.jar:4.1.43.Final]\n","stream":"stdout","time":"2020-07-24T12:35:34.314886291Z"}
{"log":"\u0009at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]\n","stream":"stdout","time":"2020-07-24T12:35:34.314888578Z"}
{"log":"Caused by: org.elasticsearch.ElasticsearchException$1: Task cancelled before it started: by user request\n","stream":"stdout","time":"2020-07-24T12:35:34.314890725Z"}
{"log":"\u0009at org.elasticsearch.ElasticsearchException.guessRootCauses(ElasticsearchException.java:644) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314892965Z"}
{"log":"\u0009at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:304) ~[elasticsearch-7.6.0.jar:7.6.0]\n","stream":"stdout","time":"2020-07-24T12:35:34.314895264Z"}

Below is the actual stack trace (in the server log format):

@pgomulka @dadoonet @javanna - can you look into this and share your input on what caused it?

{"type": "server", "timestamp": "2020-07-24T12:35:27,770Z", "level": "WARN", "component": "r.suppressed", "cluster.name": “<clusterName>”, "node.name": “Node01-76", "message": "path: /<index_name>/_s
earch, params: {typed_keys=true, ignore_unavailable=false, expand_wildcards=open, allow_no_indices=true, preference=4edb6af9-9a56-425d-94d4-a6e174efccae, ignore_throttled=true, index=<index_name>, search_type=dfs
_query_then_fetch, batched_reduce_size=512, ccs_minimize_roundtrips=true}", "cluster.uuid": "FB10k4u3Qw6PpRk7ngzT_g", "node.id": "lx2Hf5JIQjeV26NewgjjKA" , 
"stacktrace": ["org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:545) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:306) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.FetchSearchPhase.moveToNextPhase(FetchSearchPhase.java:216) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.FetchSearchPhase.lambda$innerRun$1(FetchSearchPhase.java:105) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.CountedCollector.countDown(CountedCollector.java:53) [elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:141) [elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.FetchSearchPhase.access$000(FetchSearchPhase.java:44) [elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:87) [elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.0.jar:7.6.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]",
"at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]",
"Caused by: org.elasticsearch.ElasticsearchException$1: Task cancelled before it started: by user request",
"at org.elasticsearch.ElasticsearchException.guessRootCauses(ElasticsearchException.java:644) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:304) ~[elasticsearch-7.6.0.jar:7.6.0]",
"... 13 more",
"Caused by: java.lang.IllegalStateException: Task cancelled before it started: by user request",
"at org.elasticsearch.tasks.TaskManager.registerCancellableTask(TaskManager.java:141) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.tasks.TaskManager.register(TaskManager.java:122) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:60) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.transport.TransportService.sendLocalRequest(TransportService.java:744) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.transport.TransportService.access$100(TransportService.java:75) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.transport.TransportService$3.sendRequest(TransportService.java:128) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:690) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:603) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:647) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:638) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.SearchTransportService.sendExecuteFetch(SearchTransportService.java:171) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.SearchTransportService.sendExecuteFetch(SearchTransportService.java:161) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.FetchSearchPhase.executeFetch(FetchSearchPhase.java:166) ~[elasticsearch-7.6.0.jar:7.6.0]",
"at org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:148) ~[elasticsearch-7.6.0.jar:7.6.0]",
"... 9 more"] }

Please don't ping people that aren't already part of your topic like that.

What state is your cluster in, red/green?
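For example (a minimal check, assuming the cluster is reachable on localhost:9200):

```python
import requests

# The cluster health API reports the overall status: green, yellow, or red.
health = requests.get("http://localhost:9200/_cluster/health", timeout=10).json()
print(health["status"], "unassigned shards:", health["unassigned_shards"])
```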

Thanks for the reply... Sure...

The cluster was green.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.