We have noticed that our Elasticsearch cluster often goes into RED state and queries time out.
Some nodes show high JVM memory utilization and soon leave the cluster (the logs show out-of-memory errors on the affected node, and it looks like the node cannot be reached by the other cluster members or the master).
The node eventually recovers by itself after several hours, but this causes an application outage because of the timeouts.
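For reference, here is a minimal sketch of how the per-node heap usage could be pulled out of a node stats response (for example from GET /_nodes/stats/jvm). The sample payload, node names, function name, and the 85% threshold are all made up for illustration; only the `nodes`, `name`, and `jvm.mem.heap_used_percent` fields are taken from the node stats format.

```python
# Hypothetical, heavily trimmed node stats payload (illustrative values only).
SAMPLE_STATS = {
    "nodes": {
        "id-a": {"name": "node-1", "jvm": {"mem": {"heap_used_percent": 96}}},
        "id-b": {"name": "node-2", "jvm": {"mem": {"heap_used_percent": 58}}},
        "id-c": {"name": "node-3", "jvm": {"mem": {"heap_used_percent": 88}}},
    }
}


def nodes_over_heap_threshold(stats, threshold_percent=85):
    """Return (node name, heap used %) pairs at or above the threshold,
    sorted with the highest heap usage first."""
    hot = []
    for node in stats["nodes"].values():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used >= threshold_percent:
            hot.append((node["name"], used))
    return sorted(hot, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, used in nodes_over_heap_threshold(SAMPLE_STATS):
        print(f"{name}: heap {used}% used")
```

Against a live cluster, the parsed JSON body of the stats response would be passed in place of SAMPLE_STATS.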
We have set the property
indices.fielddata.cache.size: 40%
as per the Elasticsearch recommendation, which we thought would avoid this problem. However, it doesn't seem to work.
We also set
ES_MIN_MEM: 15g
ES_MAX_MEM: 15g
on our machines, which have 30 GB of RAM.
Any help is appreciated.
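To make the numbers concrete, here is a rough sketch of what these settings work out to per node. It assumes the fielddata cache percentage is applied to the JVM heap (not to total RAM); the function names are made up for illustration.

```python
# Values quoted in this post.
HEAP_GB = 15             # ES_MIN_MEM / ES_MAX_MEM
RAM_GB = 30              # physical RAM per machine
FIELDDATA_PERCENT = 40   # indices.fielddata.cache.size: 40%


def fielddata_cache_limit_gb(heap_gb, percent):
    """Upper bound of the fielddata cache, assuming the percentage
    is taken relative to the JVM heap."""
    return heap_gb * percent / 100


def heap_share_of_ram(heap_gb, ram_gb):
    """Fraction of physical RAM handed to the JVM heap."""
    return heap_gb / ram_gb


if __name__ == "__main__":
    limit = fielddata_cache_limit_gb(HEAP_GB, FIELDDATA_PERCENT)
    share = heap_share_of_ram(HEAP_GB, RAM_GB)
    print(f"fielddata cache limit: {limit} GB of a {HEAP_GB} GB heap")
    print(f"heap share of RAM: {share:.0%}")
```

So the cache is allowed to grow to about 6 GB per node, and the heap takes half of the machine's RAM.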
Thanks