Hi all,
I have been trying to figure out why my Kibana cannot keep its connection to the Elasticsearch instance alive, and I think it is because of red cluster/index health. Whenever Kibana is running, the messages below are spammed constantly in the Elasticsearch cluster logs on all nodes. This started the same day that one of the Elasticsearch agents crashed and has continued ever since.
[2023-02-24T08:34:56,661][WARN ][o.e.c.r.a.AllocationService] [NodeName] [.kibana-event-log-8.1.3-000010][0] marking unavailable shards as stale: [V_z_SX6bTbyL3xN_dUJXdA]
[2023-02-24T08:35:02,032][WARN ][o.e.c.r.a.AllocationService] [NodeName] [.ds-auditbeat-8.1.3-2023.01.28-000014][0] marking unavailable shards as stale: [y5ut-1ZZTVejQgoSiL5XZQ]
[2023-02-24T08:35:02,212][WARN ][r.suppressed ] [NodeName] path: /winlogbeat-*,logs-endpoint.events.*,logs-windows.*/_eql/search, params: {allow_no_indices=true, index=winlogbeat-*,logs-endpoint.events.*,logs-windows.*}
org.elasticsearch.action.search.SearchPhaseExecutionException: start
at org.elasticsearch.action.search.CanMatchPreFilterSearchPhase.onPhaseFailure(CanMatchPreFilterSearchPhase.java:465) [elasticsearch-8.1.3.jar:8.1.3]
at org.elasticsearch.action.search.CanMatchPreFilterSearchPhase$1.onFailure(CanMatchPreFilterSearchPhase.java:454) [elasticsearch-8.1.3.jar:8.1.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:28) [elasticsearch-8.1.3.jar:8.1.3]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33) [elasticsearch-8.1.3.jar:8.1.3]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:776) [elasticsearch-8.1.3.jar:8.1.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-8.1.3.jar:8.1.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: org.elasticsearch.action.search.SearchPhaseExecutionException: Search rejected due to missing shards [[.ds-winlogbeat-8.1.3-2023.01.28-000013][2]]. Consider using allow_partial_search_results setting to bypass this error.
at org.elasticsearch.action.search.CanMatchPreFilterSearchPhase.checkNoMissingShards(CanMatchPreFilterSearchPhase.java:216) ~[elasticsearch-8.1.3.jar:8.1.3]
at org.elasticsearch.action.search.CanMatchPreFilterSearchPhase.run(CanMatchPreFilterSearchPhase.java:140) ~[elasticsearch-8.1.3.jar:8.1.3]
at org.elasticsearch.action.search.CanMatchPreFilterSearchPhase$1.doRun(CanMatchPreFilterSearchPhase.java:459) ~[elasticsearch-8.1.3.jar:8.1.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.1.3.jar:8.1.3]
... 6 more
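(As an aside, I understand the allow_partial_search_results setting mentioned in that trace to behave roughly like the sketch below on a plain _search request; the URL and credentials are placeholders, and I have not verified whether the _eql/search endpoint honors the same flag on 8.1.3.)

import requests

ES_URL = "https://localhost:9200"   # placeholder for my node's address
AUTH = ("elastic", "<password>")    # placeholder credentials

# Ask for whatever hits the reachable shards can return instead of rejecting
# the whole search because [.ds-winlogbeat-8.1.3-2023.01.28-000013][2] is missing.
resp = requests.get(
    f"{ES_URL}/winlogbeat-*/_search",
    params={"allow_partial_search_results": "true"},
    auth=AUTH,
    verify=False,  # placeholder: should point at the cluster's CA cert instead
)
print(resp.json().get("_shards"))   # reports total/successful/skipped/failed shard counts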
The next error is a massive block; I can provide the whole thing if necessary:
[2023-02-24T08:32:30,904][ERROR][o.e.x.e.p.RestEqlSearchAction] [NodeName] failed to send failure response
java.lang.IllegalStateException: Channel is already closed
...
Suppressed: java.lang.IllegalArgumentException: reader id must be specified
I was randomly able to log in to Kibana once after starting it up, before it lost the connection again, and saw that the .ds-winlogbeat-8.1.3-2023.01.28-000013 index is at red health. I have found plenty of information on how to fix red health issues, but here is my problem: I cannot stay connected in Kibana for more than a few seconds (if that), and I do not have an API key generated aside from the limited-permission keys for the Beats agents. My feeling is that the red health is the root cause, but I am stumped at this point and desperate for any suggestions.
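Since I cannot stay in Kibana, my current plan is to query Elasticsearch directly over its REST API with basic auth and look at why that shard is unassigned. Something like the sketch below is what I have in mind; the URL, credentials, and CA path are placeholders for my environment, so please correct me if this is the wrong direction:

import requests

ES_URL = "https://localhost:9200"            # placeholder: one of the cluster nodes
AUTH = ("elastic", "<password>")             # placeholder: any user with monitor privileges
CA_CERT = "/path/to/http_ca.crt"             # placeholder: the cluster's CA certificate

# Overall cluster health (status, unassigned shard count, etc.)
print(requests.get(f"{ES_URL}/_cluster/health?pretty", auth=AUTH, verify=CA_CERT).text)

# List only the red indices
print(requests.get(f"{ES_URL}/_cat/indices?v&health=red", auth=AUTH, verify=CA_CERT).text)

# Explain why the specific primary shard from the log is unassigned
explain_body = {
    "index": ".ds-winlogbeat-8.1.3-2023.01.28-000013",
    "shard": 2,
    "primary": True,
}
print(requests.post(f"{ES_URL}/_cluster/allocation/explain?pretty",
                    json=explain_body, auth=AUTH, verify=CA_CERT).text)

Does that seem like a reasonable way to get the allocation explanation without Kibana, or is there a better approach given the cluster's state?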