When I restart Elasticsearch, the cluster goes green for a little while, then it goes back to red and Kibana no longer comes up. Instead I see this error: {"message":"all shards failed: [search_phase_execution_exception] all shards failed","statusCode":503,"error":"Service Unavailable"}. I'm not sure why this is happening; it keeps crashing over and over, and restarts don't last long.
In the logs I see "Data too large" and shards failing, but how do I fix this?
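From the stack trace below it looks like the parent circuit breaker is tripping because the JVM heap is nearly full (7.8gb in use against a 7.5gb limit). I assume the way to confirm that is to check per-node heap and breaker usage with the standard node APIs, something like this (localhost:9200 is just my node's endpoint):

# per-node heap usage (name, heap.percent, heap.max are standard _cat/nodes columns)
curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max'

# current parent/fielddata/request circuit breaker usage and limits
curl -s 'localhost:9200/_nodes/stats/breaker?pretty'

Here is what the log shows: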
[2020-04-22T15:53:54,901][DEBUG][o.e.a.s.TransportSearchAction] [atl-cla-deves01] All shards failed for phase: [query]
[2020-04-22T15:53:54,902][WARN ][r.suppressed ] [atl-cla-deves01] path: /.kibana_task_manager/_search, params: {ignore_unavailable=true, index=.kibana_task_manager}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:305) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:139) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:264) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.search.InitialSearchPhase.onShardFailure(InitialSearchPhase.java:105) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.search.InitialSearchPhase.lambda$performPhaseOnShard$1(InitialSearchPhase.java:251) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.search.InitialSearchPhase$1.doRun(InitialSearchPhase.java:172) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:758) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.3.0.jar:7.3.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:835) [?:?]
[2020-04-22T15:53:56,175][WARN ][r.suppressed ] [atl-cla-deves01] path: /.kibana/_doc/space%3Adefault, params: {index=.kibana, id=space:default}
org.elasticsearch.action.NoShardAvailableActionException: No shard available for [get [.kibana][_doc][space:default]: routing [null]]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.perform(TransportSingleShardAction.java:228) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.start(TransportSingleShardAction.java:205) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:103) [elasticsearch-7.3.0.jar:7.3.0]
[2020-04-22T15:54:17,574][WARN ][o.e.c.r.a.AllocationService] [atl-cla-deves01] [.monitoring-kibana-7-2020.04.22][0] marking unavailable shards as stale: [VOA34f_1Tr6A4evPYstGFQ]
[2020-04-22T15:54:17,574][WARN ][o.e.c.r.a.AllocationService] [atl-cla-deves01] [.monitoring-logstash-7-2020.04.22][0] marking unavailable shards as stale: [vmkB0OVWTtCM29_5fNeCrg]
[2020-04-22T15:54:22,056][INFO ][o.e.c.r.a.AllocationService] [atl-cla-deves01] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.monitoring-kibana-7-2020.04.22][0]] ...]).
[2020-04-22T15:54:22,539][WARN ][o.e.x.m.e.l.LocalExporter] [atl-cla-deves01] unexpected error while indexing monitoring document
org.elasticsearch.xpack.monitoring.exporter.ExportException: RemoteTransportException[[met-cla-deves03][10.188.0.223:9300][indices:data/write/bulk[s]]]; nested: CircuitBreakingException[[parent] Data too large, data for [<transport_request>] would be [8376777120/7.8gb], which is larger than the limit of [8127315968/7.5gb], real usage: [8376769800/7.8gb], new bytes reserved: [7320/7.1kb], usages [request=0/0b, fielddata=7020/6.8kb, in_flight_requests=7320/7.1kb, accounting=46573684/44.4mb]];
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$throwExportException$2(LocalBulk.java:125) ~[x-pack-monitoring-7.3.0.jar:7.3.0]
at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.ActionListener.lambda$map$2(ActionListener.java:145) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:62) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$1.finishHim(TransportBulkAction.java:473) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$1.onFailure(TransportBulkAction.java:468) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:74) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.finishAsFailed(TransportReplicationAction.java:822) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase$1.handleException(TransportReplicationAction.java:780) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1111) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1111) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.InboundHandler.lambda$handleException$2(InboundHandler.java:246) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:193) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:244) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:236) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:139) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:105) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:660) [elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62) [transport-netty4-client-7.3.0.jar:7.3.0]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [netty-codec-4.1.36.Final.jar:4.1.36.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) [netty-codec-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
Caused by: org.elasticsearch.transport.RemoteTransportException: [met-cla-deves03][10.188.0.223:9300][indices:data/write/bulk[s]]
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [8376777120/7.8gb], which is larger than the limit of [8127315968/7.5gb], real usage: [8376769800/7.8gb], new bytes reserved: [7320/7.1kb], usages [request=0/0b, fielddata=7020/6.8kb, in_flight_requests=7320/7.1kb, accounting=46573684/44.4mb]
at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:342) ~[elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:173) ~[elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:121) ~[elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:105) ~[elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:660) ~[elasticsearch-7.3.0.jar:7.3.0]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) ~[?:?]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) ~[?:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1408) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930) ~[?:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) ~[?:?]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
at java.lang.Thread.run(Thread.java:835) ~[?:?]
[2020-04-22T17:23:40,549][INFO ][o.e.c.m.MetaDataCreateIndexService] [atl-cla-deves01] [filebeat-7.3.2-2020.04.22-000008] creating index, cause [rollover_index], templates [filebeat-7.3.2], shards [1]/[1], mappings [_doc]
[2020-04-22T17:23:40,928][INFO ][o.e.c.r.a.AllocationService] [atl-cla-deves01] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[filebeat-7.3.2-2020.04.22-000008][0]] ...]).
[2020-04-22T17:50:34,754][WARN ][o.e.m.j.JvmGcMonitorService] [atl-cla-deves01] [gc][7012] overhead, spent [682ms] collecting in the last [1s]
[2020-04-22T17:51:44,119][INFO ][o.e.x.m.p.NativeController] [atl-cla-deves01] Native controller process has stopped - no new native processes can be started
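If the underlying problem is simply that the heap is too small for the load, I assume the fix is either to give each node more heap in config/jvm.options, or (as a stopgap) to raise the parent breaker limit, which I believe defaults to 95% of heap in 7.x with the real-memory breaker. A rough sketch of both; the 12g value is just an example and depends on the RAM on these boxes:

# config/jvm.options on each node (keep Xms equal to Xmx, and no more than ~50% of system RAM)
-Xms12g
-Xmx12g

# or, temporarily, raise the parent breaker limit via the cluster settings API
curl -s -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "transient": {
    "indices.breaker.total.limit": "98%"
  }
}'

Is one of those the right direction, or is something else eating the heap?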