Kibana state changes to red

sahere37 · April 17, 2019, 7:39am

Hi all,
I have a three-node cluster (235, 236, 237) on Windows. Kibana is installed on all three servers, and each Kibana instance connects to the Elasticsearch node on its own server.
Sometimes the Kibana state changes from green to red. According to the .monitoring-kibana index, Kibana turned red at the times listed below. The Elasticsearch logs for some of these cases follow.
1- Case 1:
Kibana red: node-236 on 2019-04-14 at 10:43 and 10:44

 [2019-04-14T10:44:51,329][INFO ][o.e.d.z.ZenDiscovery     ] [node-236] master_left [{node-237}{ynuGMeSVRYKsa_-95s0YmA}{o1aAN6vDT1akB5HKXyhZrQ}{0.0.0.237}{0.0.0.237:9300}{ml.machine_memory=8589328384, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], reason [failed to ping, tried [3] times, each with  maximum [30s] timeout]
 [2019-04-14T10:44:51,344][WARN ][o.e.d.z.ZenDiscovery     ] [node-236] master left (reason = failed to ping, tried [3] times, each with  maximum [30s] timeout), current nodes: nodes: {node-237}{ynuGMeSVRYKsa_-95s0YmA}{o1aAN6vDT1akB5HKXyhZrQ}{0.0.0.237}{0.0.0.237:9300}{ml.machine_memory=8589328384, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}, master{node-236}{UWi2vw4-QfqJ_5rDe-j80A}{Ww4jfFFiROu4h-yW-UZLsQ}{0.0.0.236}{0.0.0.236:9300}{ml.machine_memory=8589328384, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, local
 [2019-04-14T10:44:54,375][WARN ][o.e.d.z.ZenDiscovery     ] [node-236] not enough master nodes discovered during pinging (found [[Candidate{node={node-236}{UWi2vw4-QfqJ_5rDe-j80A}{Ww4jfFFiROu4h-yW-UZLsQ}{0.0.0.236}{0.0.0.236:9300}{ml.machine_memory=8589328384, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=67330}]], but needed [2]), pinging again
 [2019-04-14T10:44:55,125][WARN ][o.e.d.z.UnicastZenPing   ] [node-236] failed to send ping to [{node-237}{ynuGMeSVRYKsa_-95s0YmA}{o1aAN6vDT1akB5HKXyhZrQ}{0.0.0.237}{0.0.0.237:9300}{ml.machine_memory=8589328384, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [node-237][0.0.0.237:9300][internal:discovery/zen/unicast] request_id [19137729] timed out after [3860ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1038) [elasticsearch-6.5.4.jar:6.5.4]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.8.0_152]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.8.0_152]
at java.lang.Thread.run(Unknown Source) [?:1.8.0_152]

2- Case 2:

Kibana red: node-236 on 2019-04-14 at 10:46

 [2019-04-14T10:45:09,316][WARN ][r.suppressed             ] [node-236] path: /_xpack/monitoring/_bulk, params: {system_id=kibana, system_api_version=6, interval=10000ms}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
 [2019-04-14T10:45:10,113][INFO ][o.e.c.s.ClusterApplierService] [node-236] detected_master {node-237}{ynuGMeSVRYKsa_-95s0YmA}{o1aAN6vDT1akB5HKXyhZrQ}{0.0.0.237}{0.0.0.237:9300}{ml.machine_memory=8589328384, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}, reason: apply cluster state (from master [master {node-237}{ynuGMeSVRYKsa_-95s0YmA}{o1aAN6vDT1akB5HKXyhZrQ}{0.0.0.237}{0.0.0.237:9300}{ml.machine_memory=8589328384, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} committed version [67331]])
 [2019-04-14T10:46:23,710][WARN ][o.e.m.j.JvmGcMonitorService] [node-236] [gc][old][86350][23] duration [22.9s], collections [1]/[23.7s], total [22.9s]/[36.7s], memory [2.7gb]->[1.5gb]/[3.9gb], all_pools {[young] [90.2mb]->[12.4mb]/[266.2mb]}{[survivor] [25.2mb]->[0b]/[33.2mb]}{[old] [2.6gb]->[1.5gb]/[3.6gb]}
 [2019-04-14T10:46:23,710][WARN ][o.e.m.j.JvmGcMonitorService] [node-236] [gc][86350] overhead, spent [23s] collecting in the last [23.7s]

3- Case 3:

Kibana red: node-237 on 2019-04-14 at 01:47, 01:48 and 01:49

 [2019-04-14T01:47:35,887][INFO ][o.e.m.j.JvmGcMonitorService] [node-237] [gc][5306110] overhead, spent [388ms] collecting in the last [1.2s]
 [2019-04-14T01:47:54,451][INFO ][o.e.m.j.JvmGcMonitorService] [node-237] [gc][5306128] overhead, spent [315ms] collecting in the last [1.1s]
 [2019-04-14T01:48:27,181][WARN ][o.e.t.TransportService   ] [node-237] Received response for a request that has timed out, sent [30842ms] ago, timed out [705ms] ago, action [internal:discovery/zen/fd/ping], node [{node-235}{s2_LzCq1TxCWple_fIx9yg}{-q2P7Z8oQcW6d3uiOJaCtA}{0.0.0.235}{0.0.0.235:9300}{ml.machine_memory=8589328384, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], id [889561226]
 [2019-04-14T01:48:28,394][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [node-237] failed to execute on node [s2_LzCq1TxCWple_fIx9yg]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [node-235][0.0.0.235:9300][cluster:monitor/nodes/stats[n]] request_id [889579252] timed out after [15166ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1038) [elasticsearch-6.5.4.jar:6.5.4]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.8.0_152]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.8.0_152]
at java.lang.Thread.run(Unknown Source) [?:1.8.0_152]
 [2019-04-14T01:48:29,257][WARN ][o.e.t.TransportService   ] [node-237] Received response for a request that has timed out, sent [15981ms] ago, timed out [815ms] ago, action [cluster:monitor/nodes/stats[n]], node [{node-235}{s2_LzCq1TxCWple_fIx9yg}{-q2P7Z8oQcW6d3uiOJaCtA}{0.0.0.235}{0.0.0.235:9300}{ml.machine_memory=8589328384, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], id [889579252]
 [2019-04-14T01:48:31,084][WARN ][o.e.m.j.JvmGcMonitorService] [node-237] [gc][young][5306164][2494235] duration [1.2s], collections [1]/[1.3s], total [1.2s]/[19.8h], memory [3gb]->[2.8gb]/[3.9gb], all_pools {[young] [255.1mb]->[66.5mb]/[266.2mb]}{[survivor] [29.9mb]->[27.8mb]/[33.2mb]}{[old] [2.7gb]->[2.7gb]/[3.6gb]}
 [2019-04-14T01:48:36,126][WARN ][o.e.m.j.JvmGcMonitorService] [node-237] [gc][5306169] overhead, spent [519ms] collecting in the last [1s]
 [2019-04-14T01:49:59,288][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [node-237] collector [cluster_stats] timed out when collecting data

Any advice would be much appreciated.

warkolm · April 18, 2019, 9:20pm

Looks like you have nodes dropping out due to excessive GC.

What version are you on?
How many shards and indices?
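
One way to check the GC theory is to scan each node's Elasticsearch log for long collections reported by JvmGcMonitorService and compare their timestamps with the times Kibana went red. The snippet below is only a rough sketch: the regular expression is written against the log lines quoted in this thread (shortened here after the duration field), and the function name `long_gc_pauses` and the 10 second threshold are assumptions made for illustration, not anything defined by Elasticsearch.

```python
import re

# Pattern for JvmGcMonitorService lines that report a single collection, e.g.
# [..][WARN ][o.e.m.j.JvmGcMonitorService] [node-236] [gc][old][86350][23] duration [22.9s], ...
# The "[gc][NNN] overhead, spent ..." summary lines are deliberately not matched.
GC_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[o\.e\.m\.j\.JvmGcMonitorService\]\s+"
    r"\[(?P<node>[^\]]+)\]\s+\[gc\]\[(?P<gen>young|old)\]\[\d+\]\[\d+\]\s+"
    r"duration \[(?P<value>[\d.]+)(?P<unit>ms|s|m|h)\]"
)

# Multipliers that convert the logged duration units to seconds.
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def long_gc_pauses(lines, threshold_s=10.0):
    """Return (timestamp, node, generation, seconds) for each logged
    collection whose duration is at least threshold_s seconds.

    threshold_s is an arbitrary illustrative cut-off, not an official limit.
    """
    pauses = []
    for line in lines:
        match = GC_LINE.search(line)
        if match is None:
            continue
        seconds = float(match["value"]) * UNIT_SECONDS[match["unit"]]
        if seconds >= threshold_s:
            pauses.append((match["ts"], match["node"], match["gen"], seconds))
    return pauses


# Lines taken from the logs posted above, shortened after the duration field.
SAMPLE = [
    "[2019-04-14T10:46:23,710][WARN ][o.e.m.j.JvmGcMonitorService] [node-236] "
    "[gc][old][86350][23] duration [22.9s], collections [1]/[23.7s]",
    "[2019-04-14T10:46:23,710][WARN ][o.e.m.j.JvmGcMonitorService] [node-236] "
    "[gc][86350] overhead, spent [23s] collecting in the last [23.7s]",
    "[2019-04-14T01:48:31,084][WARN ][o.e.m.j.JvmGcMonitorService] [node-237] "
    "[gc][young][5306164][2494235] duration [1.2s], collections [1]/[1.3s]",
]

if __name__ == "__main__":
    for ts, node, gen, seconds in long_gc_pauses(SAMPLE):
        print(f"{ts} {node} {gen} GC took {seconds:.1f}s")
```

With the sample lines, only the old-generation collection on node-236 passes the threshold. Any iterable of lines works as input, so an open log file can be passed in directly.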

sahere37 · April 20, 2019, 4:05am

@warkolm many thanks for your reply.

The ELK version is 6.5. In total there are 161 indices, 729 primary shards and 729 replica shards, 163,306,395 documents, and 191.6 GB of data. Each of the three servers has 8 GB of heap memory assigned. Indices are configured with 5 primary shards and 1 replica.
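
To put these numbers in context, the sketch below works out how many shards each node holds and compares that with the often-quoted rule of thumb of keeping a node below roughly 20 shards per GB of configured heap. That figure is a general guideline rather than a hard limit, and the function names are made up for this example. The heap size is a parameter because the GC lines above report a heap of about 3.9gb, which does not match the 8 GB stated here and is worth double-checking.

```python
import math


def shards_per_node(primary_shards, replicas_per_primary, data_nodes):
    """Return (total shards, shards per node), assuming an even spread."""
    total = primary_shards * (1 + replicas_per_primary)
    return total, math.ceil(total / data_nodes)


def shard_budget(heap_gb, shards_per_gb_heap=20):
    """Rule-of-thumb ceiling on shards for one node (guideline only)."""
    return int(heap_gb * shards_per_gb_heap)


if __name__ == "__main__":
    # Figures from this post: 729 primaries, 1 replica each, 3 nodes.
    total, per_node = shards_per_node(729, 1, 3)
    budget = shard_budget(heap_gb=8)
    print(f"total shards: {total}, per node: {per_node}, guideline: {budget}")
    if per_node > budget:
        print("shards per node exceed the rule-of-thumb budget")
```

With the figures in this post, that is 1458 shards in total, or 486 per node, against a guideline of about 160 for an 8 GB heap. A smaller heap lowers the budget further.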

system · May 18, 2019, 4:05am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
