PROBLEM SOLVED!
Kibana conducts a health check every few seconds by querying the nodes API (/_nodes) of Elasticsearch. If this does not respond quickly enough, Kibana will go red.
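To check whether the nodes API really is slow on a given cluster, one option is to time the call by hand. The following is a minimal Python sketch of that idea, not Kibana's own health check logic. The address `http://localhost:9200`, the 3-second threshold, and the helper names `fetch_nodes` and `timed_call` are all assumptions made for illustration.

```python
import time
import urllib.request

ES_URL = "http://localhost:9200"   # assumed Elasticsearch address
SLOW_AFTER_SECONDS = 3.0           # assumed threshold, not a Kibana default


def fetch_nodes(path="/_nodes"):
    """Fetch a nodes API path and return the raw response body."""
    with urllib.request.urlopen(ES_URL + path, timeout=30) as resp:
        return resp.read()


def timed_call(fetch, *args):
    """Run fetch(*args) and return (elapsed_seconds, is_slow)."""
    start = time.monotonic()
    fetch(*args)
    elapsed = time.monotonic() - start
    return elapsed, elapsed > SLOW_AFTER_SECONDS


if __name__ == "__main__":
    # Time a single call to the nodes API and report whether it was slow.
    elapsed, is_slow = timed_call(fetch_nodes, "/_nodes")
    print(f"/_nodes: {elapsed:.3f}s slow={is_slow}")
```

Running it a few times while the cluster is under load should give a rough idea of whether the call is slow enough to upset the health check.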
It has been discovered that this is a very costly API call to make (see "Kibana causes off heap memory problems on elasticsearch masters", Issue #16733, elastic/kibana on GitHub), and the health check is in the process of being removed by the Kibana team (see "Remove the Health Check", Issue #14163, elastic/kibana on GitHub).
Until then, we've taken a few counter-measures:
- Increased the time between health checks to 1 hour,
- Increased the size of the servers in the cluster (especially the data nodes), and
[2018-05-04T08:49:12,857][WARN ][o.e.m.j.JvmGcMonitorService] [ip-10-50-40-233] [gc][4448917] overhead, spent [899ms] collecting in the last [1s]
[2018-05-04T08:49:13,930][WARN ][o.e.m.j.JvmGcMonitorService] [ip-10-50-40-233] [gc][4448918] overhead, spent [921ms] collecting in the last [1s]
- Introduced HAProxy to rewrite /_nodes to /_nodes/_local, a local endpoint that doesn't go to the cluster masters.
HAProxy config:
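frontend elastic-in
    # Listen on port 9201 and forward everything to the backend below
    bind :9201
    default_backend elastic-out

backend elastic-out
    # Rewrite any path starting with /_nodes to the node-local endpoint
    http-request set-path /_nodes/_local if { path_beg /_nodes }
    server localhost 10.50.30.150:9200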
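To make the effect of the rewrite rule concrete, here is a small Python model of what the `http-request set-path` line does to an incoming request URI. It is only a sketch for reasoning about the rule, not part of the setup itself, and the function name `rewrite_uri` is made up for this example.

```python
def rewrite_uri(uri: str) -> str:
    """Model of: http-request set-path /_nodes/_local if { path_beg /_nodes }

    path_beg is a prefix match on the path only, and set-path replaces
    the path while leaving any query string untouched.
    """
    path, sep, query = uri.partition("?")
    if path.startswith("/_nodes"):
        path = "/_nodes/_local"
    return path + sep + query


if __name__ == "__main__":
    for uri in ("/_nodes", "/_nodes?pretty", "/_nodes/stats", "/_search"):
        print(f"{uri} -> {rewrite_uri(uri)}")
```

One thing the model makes visible: because the match is a prefix match, every path under /_nodes (for example /_nodes/stats) is also sent to /_nodes/_local. If other nodes API endpoints need to pass through this proxy unchanged, an exact match such as `{ path /_nodes }` may be a better fit, though that has not been tested here.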