Metrics fail after several hours

Elasticsearch: 7.10 (Docker)
Kibana: 7.10 (Docker)
Metricbeat: 7.10 (Docker)

Hello,

I have a strange issue with my ELK deployment.
When I deploy Elasticsearch + Kibana, I get the metrics from the monitored server. The problem occurs several hours after the deployment (I don't know exactly when, but I'd say 10-12 hours).

When I try to access the metrics from the Metrics tab, I get an "Internal Server Error (500)".
URL:
http://:5601/api/metrics/snapshot

When I open my dashboard ([Metricbeat System] Host overview ECS), I get part of my metrics but lose some of the panels; they display the following error:

The request for this panel failed
all shards failed

The following panels are failing:

  • Inbound Traffic
  • Outbound Traffic
  • Disk Usage
  • Network Traffic (Packets)
  • Network Traffic (Bytes)
  • Processes By Memory
  • Processes By CPU
  • Interfaces By Incoming Traffic
  • Interfaces By Outgoing Traffic

I don't understand why it fails this way.

Are there any errors in the Elasticsearch logs? My best guess is that when the index rolls over, the mappings for the new indices are missing. That would explain why it starts out fine but then eventually starts failing after several hours.
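A quick way to check that (just a sketch from Dev Tools; it assumes the default metricbeat-* index pattern and the standard system module field names, so adjust if yours differ):

GET _cat/indices/metricbeat-*?v&s=index

GET metricbeat-*/_mapping/field/system.network.in.bytes

If the newest index appears in the first listing but comes back with an empty mapping for a field like that in the second, the rolled-over indices were created without the Metricbeat template, which would match the symptoms.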

Hi,

Yes, I can see the stack trace:

"stacktrace": ["org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:568) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:324) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:603) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:400) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:236) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:303) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.10.0.jar:7.10.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
"at java.lang.Thread.run(Thread.java:832) [?:?]"] }
{"type": "server", "timestamp": "2020-11-29T20:35:12,603+01:00", "level": "WARN", "component": "r.suppressed", "cluster.name": "docker-cluster", "node.name": "f6f831e1284f", "message": "path: /.kibana/_count, params: {index=.kibana}", "cluster.uuid": "ncuITa4nTK-wbRWRg3_3Lg", "node.id": "g3e15271RR-DQMIQiSTk3g" ,
"stacktrace": ["org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:568) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:324) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:603) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:400) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:236) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:303) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.10.0.jar:7.10.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
"at java.lang.Thread.run(Thread.java:832) [?:?]"] }

I'm not very good with Java...

Is there more to the log?
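The warning you pasted is for a count on the .kibana index (path: /.kibana/_count), so the shard failures don't seem limited to the Metricbeat indices. If the log doesn't show the per-shard root cause, you can usually surface it by re-running the failing request from Dev Tools and listing the shard states (a generic sketch, nothing here is specific to your setup):

GET .kibana/_count

GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state

If the count fails, the error response should include the per-shard reason, and the shard listing shows whether any shards are UNASSIGNED.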

Here is a log from Kibana.
Strangely, no logs are generated by Elasticsearch.

http://pastebin.fr/pastebin.php?dl=76496

Can you please post a link to the text rather than a download?

Of course.

How could I check this? And how could I fix it?

Can you paste the output from

GET metricbeat-*/_mapping

If you changed the pattern of your indices, adjust the query above. This will give us a listing of all the mappings for the indices queried by the Metrics UI.
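In that output, look at the most recent metricbeat-* index: if fields such as system.network.in.bytes are missing there or not mapped as numbers, the index was created without the Metricbeat template. You can also check whether the template itself is still installed (a sketch, assuming the 7.x default template name):

GET _cat/templates/metricbeat*?v

If it is missing, re-running Metricbeat's setup (for example metricbeat setup --index-management) should reload it. Note that templates only apply when an index is created, so any already-broken indices would still need to be deleted or reindexed.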
