ES 6.5.4 - Error : monitoring execution failed org.elasticsearch

Hi

in my new job i discover we use Elasticsearch cluser 6.5.4 and i am totaly newbie, i try to understand how it s work since many hours. since many day we encoutering many mistake

i discover my 3 node have same error in logfile

{
  "cluster_name" : "saas",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 9025,
  "active_shards" : 13037,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 5078,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 71.96798233508143
}

[2021-07-07T15:58:12,774][WARN ][o.e.x.m.e.l.LocalExporter] [lfrn1kibana04-node-1] unexpected error while indexing monitoring document
org.elasticsearch.xpack.monitoring.exporter.ExportException: UnavailableShardsException[[.monitoring-es-6-2021.07.07][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[.monitoring-es-6-2021.07.07][0]] containing [20671] requests]]
        at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$throwExportException$2(LocalBulk.java:128) ~[?:?]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_191]
        at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) ~[?:1.8.0_191]
        at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:1.8.0_191]
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[?:1.8.0_191]
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[?:1.8.0_191]
        at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) ~[?:1.8.0_191]
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174) ~[?:1.8.0_191]
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_191]
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) ~[?:1.8.0_191]
        at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.throwExportException(LocalBulk.java:129) ~[?:?]
        at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$doFlush$0(LocalBulk.java:111) ~[?:?]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:85) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:81) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.bulk.TransportBulkAction$BulkRequestModifier.lambda$wrapActionListenerIfNeeded$0(TransportBulkAction.java:607) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$1.finishHim(TransportBulkAction.java:414) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$1.onFailure(TransportBulkAction.java:409) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:91) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.finishAsFailed(TransportReplicationAction.java:901) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.retry(TransportReplicationAction.java:873) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:932) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryIfUnavailable(TransportReplicationAction.java:778) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:731) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:892) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:317) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:244) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:559) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
Caused by: org.elasticsearch.action.UnavailableShardsException: [.monitoring-es-6-2021.07.07][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[.monitoring-es-6-2021.07.07][0]] containing [20671] requests]
        ... 12 more
[2021-07-07T15:58:12,774][WARN ][o.e.x.m.MonitoringService] [lfrn1kibana04-node-1] monitoring execution failed
org.elasticsearch.xpack.monitoring.exporter.ExportException: Exception when closing export bulk
        at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1$1.<init>(ExportBulk.java:95) ~[?:?]
        at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1.onFailure(ExportBulk.java:93) ~[?:?]
        at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:206) ~[?:?]
        at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:200) ~[?:?]
        at org.elasticsearch.xpack.core.common.IteratingActionListener.onResponse(IteratingActionListener.java:115) ~[?:?]
        at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:164) ~[?:?]
        at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.throwExportException(LocalBulk.java:135) ~[?:?]
        at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$doFlush$0(LocalBulk.java:111) ~[?:?]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:85) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:81) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.bulk.TransportBulkAction$BulkRequestModifier.lambda$wrapActionListenerIfNeeded$0(TransportBulkAction.java:607) ~[elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$1.finishHim(TransportBulkAction.java:414) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$1.onFailure(TransportBulkAction.java:409) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:91) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.finishAsFailed(TransportReplicationAction.java:901) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.retry(TransportReplicationAction.java:873) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:932) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryIfUnavailable(TransportReplicationAction.java:778) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:731) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:892) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:317) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:244) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:559) [elasticsearch-6.5.4.jar:6.5.4]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulks
        at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:156) ~[?:?]
        ... 26 more
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: bulk [default_local] reports failures when exporting documents
        at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.throwExportException(LocalBulk.java:124) ~[?:?]
        ... 24 more


My kibana can t read with this error

Request to Elasticsearch failed: {"error":{"root_cause":[],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[]},"status":503}

Someone can help to debugg that ? i continue to read forum topic for learn about same issues
Thanks a lot

Your main issue is that you have too many most likely small shards in your cluster. I would recommend you read this blog post for some practical guidelines on sharding. I would recommend ryou to reduce the. number of shards in your cluster and this guide provides information about how to go about doing so.

Ok i understand better.

What can i do if indices is in yellow or red health do i need to

  • close
  • force index
  • clear
  • flush
  • delete

Thanks