Elasticsearch 1.5 cluster going to RED state due to some nodes constantly exiting and rejoining the cluster


(sarya) #1

We have noticed that our Elasticsearch cluster often goes into the RED state and queries time out.
Some nodes show high JVM memory utilization and soon afterwards leave the cluster (the logs show out-of-memory errors on that node, and it looks like the node cannot be reached by the other cluster members or the master).
The node eventually recovers by itself after several hours, but this causes an application outage because of the timeouts.
We have set the property
indices.fielddata.cache.size: 40%

as per the Elasticsearch recommendation, which we thought would avoid this problem. However, it doesn't seem to work.
We also set
ES_MIN_MEM: 15g
ES_MAX_MEM: 15g
on our machines, which have 30 GB of RAM.

Any help appreciated.
Thanks
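
For reference, the settings quoted above imply a rough memory budget per node. The sketch below only does that arithmetic; the function name and the returned keys are made up for illustration, and it assumes the 40% in indices.fielddata.cache.size is taken relative to the JVM heap set via ES_MAX_MEM.

```python
def memory_budget_gb(ram_gb, heap_gb, fielddata_percent):
    """Rough per-node memory budget implied by the settings in the post.

    Assumption: the fielddata cache percentage is relative to the JVM heap.
    """
    fielddata_gb = heap_gb * fielddata_percent / 100
    return {
        "heap_gb": heap_gb,
        "fielddata_cache_gb": fielddata_gb,
        # Heap left for everything else (queries, filter cache, indexing buffers).
        "other_heap_gb": heap_gb - fielddata_gb,
        # RAM left outside the JVM, mostly used by the OS file system cache.
        "outside_heap_gb": ram_gb - heap_gb,
    }


if __name__ == "__main__":
    # Values from the post: 30 GB RAM, 15g heap, 40% fielddata cache.
    for key, value in memory_budget_gb(30, 15, 40).items():
        print(key, value)
```

With the posted values, the fielddata cache can grow to about 6 GB of the 15 GB heap, which leaves about 9 GB of heap for everything else on the node.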


(Mark Walkom) #2

Do you have Marvel installed? If not, I would start there; it'll tell you what is happening with a bit more certainty.


(sarya) #3

Yes, we have Marvel installed, but what specifically should we look for with this particular issue?
Thanks


(Mark Walkom) #4

Heap use and GC, that's what it sounds like.
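
If it helps to see the same numbers outside Marvel, heap use and GC counters can also be read from the node stats API. This is a minimal sketch, not a tested monitoring tool: the helper names and the localhost URL are assumptions, and it expects the `jvm.mem.heap_used_percent` and `jvm.gc.collectors.old` fields that `GET /_nodes/stats/jvm` returns on 1.x.

```python
import json
from urllib.request import urlopen


def summarize_jvm(stats):
    """Return one row per node, sorted by heap usage (highest first)."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        jvm = node.get("jvm", {})
        old_gc = jvm.get("gc", {}).get("collectors", {}).get("old", {})
        rows.append({
            "name": node.get("name", node_id),
            "heap_used_percent": jvm.get("mem", {}).get("heap_used_percent"),
            "old_gc_count": old_gc.get("collection_count"),
            "old_gc_time_ms": old_gc.get("collection_time_in_millis"),
        })
    return sorted(rows, key=lambda r: r["heap_used_percent"] or 0, reverse=True)


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes; base_url is an assumed default."""
    with urlopen(base_url + "/_nodes/stats/jvm", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    for row in summarize_jvm(fetch_jvm_stats()):
        print(row)
```

Running it a few minutes apart and comparing the output should show whether heap on the affected nodes stays high while the old-generation GC count and time keep climbing, which is the pattern to look for before a node drops out.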

