Hi,
I have set up the ELK stack on a single server, but I have noticed that the server becomes unresponsive after two or three days. Each time, when I check elasticsearch.log and gc.log.0.current, I find the same errors just before the server hangs. The errors are as follows:
[2018-11-09T10:19:10,182][ERROR][o.e.x.m.c.n.NodeStatsCollector] [SLT_Syslog] collector [node_stats] timed out when collecting data
[2018-11-09T10:20:40,511][ERROR][o.e.x.m.c.i.IndexStatsCollector] [SLT_Syslog] collector [index-stats] timed out when collecting data
[2018-11-09T10:21:40,507][ERROR][o.e.x.m.c.n.NodeStatsCollector] [SLT_Syslog] collector [node_stats] timed out when collecting data
[2018-11-09T10:22:20,699][ERROR][o.e.x.m.c.n.NodeStatsCollector] [SLT_Syslog] collector [node_stats] timed out when collecting data
[2018-11-09T10:22:50,781][ERROR][o.e.x.m.c.n.NodeStatsCollector] [SLT_Syslog] collector [node_stats] timed out when collecting data
[2018-11-09T10:23:10,805][ERROR][o.e.x.m.c.n.NodeStatsCollector] [SLT_Syslog] collector [node_stats] timed out when collecting data
[2018-11-09T10:24:26,104][WARN ][o.e.m.j.JvmGcMonitorService] [SLT_Syslog] [gc][young][35285][10524] duration [35.1s], collections [1]/[51.2s], total [35.1s]/[1.5m], memory [485mb]->[307.6mb]/[990.7mb], all_pools {[young] [223.1mb]->[50.5mb]/[266.2mb]}{[survivor] [11.6mb]->[6.3mb]/[33.2mb]}{[old] [250.2mb]->[250.8mb]/[691.2mb]}
[2018-11-09T10:25:21,205][WARN ][o.e.m.j.JvmGcMonitorService] [SLT_Syslog] [gc][35285] overhead, spent [35.1s] collecting in the last [51.2s]
[2018-11-09T10:27:41,340][ERROR][o.e.x.m.c.n.NodeStatsCollector] [SLT_Syslog] collector [node_stats] timed out when collecting data
[2018-11-09T10:28:41,341][ERROR][o.e.x.m.c.n.NodeStatsCollector] [SLT_Syslog] collector [node_stats] timed out when collecting data
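
In case it helps whoever looks at this, here is a minimal Python sketch for reading the GC overhead out of the JvmGcMonitorService line above. It is my own helper, not part of Elasticsearch: the names `OVERHEAD_RE`, `gc_overhead_fraction` and `SAMPLE` are hypothetical, and the pattern assumes only the log format shown in this post.

```python
import re

# Matches the "overhead" line written by JvmGcMonitorService, as pasted above.
# Assumption: durations are printed with an "ms", "s" or "m" suffix.
OVERHEAD_RE = re.compile(
    r"\[gc\]\[\d+\] overhead, spent \[(?P<spent>[\d.]+)(?P<spent_unit>ms|s|m)\] "
    r"collecting in the last \[(?P<window>[\d.]+)(?P<window_unit>ms|s|m)\]"
)

# Conversion factors from the log's unit suffix to seconds.
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}

# The overhead line from the log above, copied verbatim.
SAMPLE = (
    "[2018-11-09T10:25:21,205][WARN ][o.e.m.j.JvmGcMonitorService] [SLT_Syslog] "
    "[gc][35285] overhead, spent [35.1s] collecting in the last [51.2s]"
)


def gc_overhead_fraction(line):
    """Return the fraction of the window spent in GC, or None if no match."""
    match = OVERHEAD_RE.search(line)
    if match is None:
        return None
    spent = float(match.group("spent")) * UNIT_SECONDS[match.group("spent_unit")]
    window = float(match.group("window")) * UNIT_SECONDS[match.group("window_unit")]
    return spent / window


if __name__ == "__main__":
    fraction = gc_overhead_fraction(SAMPLE)
    print(f"GC overhead: {fraction:.0%}")
```

For the line above this works out to 35.1 s out of 51.2 s, so roughly 69% of that window was spent in garbage collection, on a heap of about 990.7mb according to the previous log entry.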
Please help me to get this sorted out.
BR,
Billz