Metrics fail after several hours

Elasticsearch: 7.10 (Docker)
Kibana: 7.10 (Docker)
Metricbeat: 7.10 (Docker)

Hello,

I have a strange issue with my ELK deployment.
When I deploy Elasticsearch + Kibana, I get the metrics from the monitored server. The problem occurs several hours after the deployment (I don't know exactly when, but I'd say 10-12 hours).

When I try to access the metrics from the Metrics tab, I get an "Internal Server Error (500)".
URL:
http://:5601/api/metrics/snapshot

When I open my dashboard ([Metricbeat System] Host overview ECS), I get part of my metrics but lose some of the panels; they display the following error:

The request for this panel failed
all shards failed

The following panels are failing:

  • Inbound Traffic
  • Outbound Traffic
  • Disk Usage
  • Network Traffic (Packets)
  • Network Traffic (Bytes)
  • Processes By Memory
  • Processes By CPU
  • Interfaces By Incoming Traffic
  • Interfaces By Outgoing Traffic

I don't understand why it fails this way.

Are there any errors in the Elasticsearch logs? My best guess is that when the index rolls over, the mappings for the new indices are missing. That would explain why it starts out fine but then eventually starts failing after several hours.
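A quick way to check that (just a sketch from Dev Tools; it assumes the default metricbeat-* index pattern and the standard system module field names, so adjust if yours differ):

GET _cat/indices/metricbeat-*?v&s=index

GET metricbeat-*/_mapping/field/system.network.in.bytes

If the newest index appears in the first listing but comes back with an empty mapping for a field like that in the second, the rolled-over indices were created without the Metricbeat template, which would match the symptoms.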

Hi,

Yes, I can see the stack trace:

"stacktrace": ["org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:568) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:324) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:603) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:400) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:236) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:303) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.10.0.jar:7.10.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
"at java.lang.Thread.run(Thread.java:832) [?:?]"] }
{"type": "server", "timestamp": "2020-11-29T20:35:12,603+01:00", "level": "WARN", "component": "r.suppressed", "cluster.name": "docker-cluster", "node.name": "f6f831e1284f", "message": "path: /.kibana/_count, params: {index=.kibana}", "cluster.uuid": "ncuITa4nTK-wbRWRg3_3Lg", "node.id": "g3e15271RR-DQMIQiSTk3g" ,
"stacktrace": ["org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:568) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:324) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:603) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:400) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:236) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:303) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) [elasticsearch-7.10.0.jar:7.10.0]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.10.0.jar:7.10.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
"at java.lang.Thread.run(Thread.java:832) [?:?]"] }

I'm not very good with Java...

Is there more to the log?
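The warning you pasted is for a count on the .kibana index (path: /.kibana/_count), so the shard failures don't seem limited to the Metricbeat indices. If the log doesn't show the per-shard root cause, you can usually surface it by re-running the failing request from Dev Tools and listing the shard states (a generic sketch, nothing here is specific to your setup):

GET .kibana/_count

GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state

If the count fails, the error response should include the per-shard reason, and the shard listing shows whether any shards are UNASSIGNED.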

Here is a log from Kibana.
Strangely, no logs are generated by Elasticsearch.

http://pastebin.fr/pastebin.php?dl=76496

Can you please post a link to the text rather than a download?

Of course.

How could I check this? And how could I fix it?

Can you paste the output from

GET metricbeat-*/_mapping

If you changed the pattern of your indices, adjust the query above. This will give us a listing of all the mappings for the indices queried by the Metrics UI.
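In that output, look at the most recent metricbeat-* index: if fields such as system.network.in.bytes are missing there or not mapped as numbers, the index was created without the Metricbeat template. You can also check whether the template itself is still installed (a sketch, assuming the 7.x default template name):

GET _cat/templates/metricbeat*?v

If it is missing, re-running Metricbeat's setup (for example metricbeat setup --index-management) should reload it. Note that templates only apply when an index is created, so any already-broken indices would still need to be deleted or reindexed.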
