I have four .NET services that use Elasticsearch for indexing and querying documents.
From time to time the .NET services report timeout errors when connecting to Elasticsearch, and during those periods Kibana can't connect to Elasticsearch either. When I check the Elasticsearch cluster at those times, it isn't recording any metrics, and it doesn't look like a problem with high CPU or memory consumption.
When looking at the logs, I see these warnings:
{"@timestamp":"2024-02-28T12:23:34.583Z", "log.level": "WARN", "message":"block until refresh ran out of slots and forced a refresh: [BulkShardRequest [[admin_tasks-2024.02][0]] containing [index {[admin_tasks-2024.02][f79b5c13-7273-4f7a-b940-de39f7a63ef4], source[{\"startAt\":\"2024-02-28T12:23:20.3958202Z\",\"lastUpdateAt\":\"2024-02-28T12:23:20.3958202Z\",\"action\":31,\"status\":1,\"executionTimeout\":9223372036854775807,\"proxyConfigurationId\":\"ca930274-f400-47c1-866f-0a2be93534b7\",\"organizationId\":\"d51fec8b-8742-41ca-bf2a-a7657b863526\",\"id\":\"f79b5c13-7273-4f7a-b940-de39f7a63ef4\"}]}] blocking until refresh]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-es-master-1][write][T#1]","log.logger":"org.elasticsearch.action.bulk.TransportShardBulkAction","trace.id":"8c00e06ad3844e0ab0338c97aa88a105","elasticsearch.cluster.uuid":"monZA5VJRTOM7JUtepfP9A","elasticsearch.node.id":"OHSa8hCNQ3uyL-ouWtwHpg","elasticsearch.node.name":"elasticsearch-es-master-1","elasticsearch.cluster.name":"elasticsearch"}
> {"@timestamp":"2024-02-28T14:19:56.627Z", "log.level": "WARN", "message":"block until refresh ran out of slots and forced a refresh: [BulkShardRequest [[admin_tasks-2024.02][0]] containing [index {[admin_tasks-2024.02][4c8f6d14-383f-4719-91c9-644cf4ce8572], source[{\"startAt\":\"2024-02-28T14:16:55.4137373Z\",\"lastUpdateAt\":\"2024-02-28T14:16:55.4137373Z\",\"action\":41,\"status\":2,\"resultMessage\":\"Total: 0\",\"executionTimeout\":0,\"agentUUID\":\"4C4C4544-0033-3710-8047-B4C04F375432\",\"organizationId\":\"85afb8b7-8f7a-400c-9fe4-48ebffe08172\",\"id\":\"4c8f6d14-383f-4719-91c9-644cf4ce8572\"}]}] blocking until refresh]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-es-master-1][write][T#2]","log.logger":"org.elasticsearch.action.bulk.TransportShardBulkAction","trace.id":"3542621bd86dfd6f886f3bd43eab525e","elasticsearch.cluster.uuid":"monZA5VJRTOM7JUtepfP9A","elasticsearch.node.id":"OHSa8hCNQ3uyL-ouWtwHpg","elasticsearch.node.name":"elasticsearch-es-master-1","elasticsearch.cluster.name":"elasticsearch"}
> {"@timestamp":"2024-02-28T14:19:56.760Z", "log.level": "WARN", "message":"block until refresh ran out of slots and forced a refresh: [BulkShardRequest [[admin_tasks-2024.02][0]] containing [index {[admin_tasks-2024.02][81a3665a-aa95-43bf-8f1d-fe828d1d74e8], source[{\"startAt\":\"2024-02-28T14:16:57.3878789Z\",\"lastUpdateAt\":\"2024-02-28T14:16:57.3878789Z\",\"action\":41,\"status\":2,\"resultMessage\":\"Total: 0\",\"executionTimeout\":0,\"agentUUID\":\"DD595EFE-3234-4E8C-B4BF-CE96B0CB4227\",\"organizationId\":\"85afb8b7-8f7a-400c-9fe4-48ebffe08172\",\"id\":\"81a3665a-aa95-43bf-8f1d-fe828d1d74e8\"}]}] blocking until refresh]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-es-master-1][write][T#1]","log.logger":"org.elasticsearch.action.bulk.TransportShardBulkAction","trace.id":"4c0b8a90b33eb0cc38dd483b14d026c7","elasticsearch.cluster.uuid":"monZA5VJRTOM7JUtepfP9A","elasticsearch.node.id":"OHSa8hCNQ3uyL-ouWtwHpg","elasticsearch.node.name":"elasticsearch-es-master-1","elasticsearch.cluster.name":"elasticsearch"}
Any ideas what could cause Elasticsearch to stop responding? How can I find out what is causing this intermittent behaviour?
Stopping and restarting all my services seems to fix the issue for a while, but after some time the same problem comes back.