We have been running an Elasticsearch cluster on AWS EC2 for the past 2 years. For the past two months, however, our data nodes have been throwing "OutOfMemoryError". After each error, one or more nodes drop out of the cluster and the cluster immediately goes red. CPU usage on the servers is no more than 80% at the time. We originally had only 5 data nodes; after this problem started we added two more data nodes and set the circuit breakers, but we still get the error. Could someone please help us fix this? Thanks in advance.
Please find the Elasticsearch details below:
Elasticsearch Version : 2.3.5
Client Node : 2 (RAM - 32 GB, Max Heap - 25 GB)
**Dedicated Master Node : 3**
Data Node : 7 (RAM - 16 GB, Max Heap - 10 GB)
Total Cluster Size : 850 GB
**Indices : 7**
Total shards : 280 (20 shards for each index and 1 replica)
indices.breaker.total.limit: 80%
indices.breaker.fielddata.limit: 60%
indices.breaker.request.limit: 40%
We checked our logs and tracking data; there were no heavy searches or indexing at the time the cluster went down. Even when idle, heap usage on the data nodes is around 60-70%.
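To put the numbers above in perspective, here is a rough sketch of the arithmetic in Python. The variable and function names are only for illustration, and it assumes the 850 GB total includes replica copies, which the details above do not state. The breaker limits are percentages of the whole JVM heap, so they are compared here against the heap that is still free when the nodes are idle.

```python
# Figures taken from the cluster details in this post.
DATA_NODES = 7
DATA_NODE_HEAP_GB = 10
INDICES = 7
PRIMARIES_PER_INDEX = 20
REPLICAS = 1
TOTAL_SIZE_GB = 850  # assumption: this figure includes replica copies

# Circuit breaker limits as fractions of the JVM heap.
BREAKER_LIMITS = {"total": 0.80, "fielddata": 0.60, "request": 0.40}

# Idle heap usage observed on the data nodes (60-70%).
IDLE_HEAP_USED = (0.60, 0.70)


def breaker_limits_gb(heap_gb, limits):
    """Convert percentage-based breaker limits to GB for a given heap size."""
    return {name: round(heap_gb * frac, 2) for name, frac in limits.items()}


def total_shards(indices, primaries, replicas):
    """Each primary shard has `replicas` additional copies."""
    return indices * primaries * (1 + replicas)


limits_gb = breaker_limits_gb(DATA_NODE_HEAP_GB, BREAKER_LIMITS)
shards = total_shards(INDICES, PRIMARIES_PER_INDEX, REPLICAS)
shards_per_node = shards / DATA_NODES
avg_shard_gb = TOTAL_SIZE_GB / shards

# Heap left over when idle: smallest and largest values in GB.
idle_free_gb = sorted(
    round(DATA_NODE_HEAP_GB * (1 - used), 2) for used in IDLE_HEAP_USED
)

print("breaker limits (GB):", limits_gb)
print("total shards:", shards)
print("shards per data node:", shards_per_node)
print("average shard size (GB):", round(avg_shard_gb, 2))
print("idle free heap (GB):", idle_free_gb)
```

If this arithmetic is right, the total breaker limit (8 GB) is well above the 3-4 GB of heap that is free at idle, so the breakers may not trip before the heap is exhausted. This is only a rough comparison, since the breakers track estimated memory for specific structures rather than all heap usage.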
Error:
[2018-04-04 03:01:58,915][WARN ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
at org.jboss.netty.buffer.HeapChannelBuffer.<init>(HeapChannelBuffer.java:42)
at org.jboss.netty.buffer.BigEndianHeapChannelBuffer.<init>(BigEndianHeapChannelBuffer.java:34)
at org.jboss.netty.buffer.ChannelBuffers.buffer(ChannelBuffers.java:134)
at org.jboss.netty.buffer.HeapChannelBufferFactory.getBuffer(HeapChannelBufferFactory.java:68)
at org.jboss.netty.buffer.AbstractChannelBufferFactory.getBuffer(AbstractChannelBufferFactory.java:48)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:80)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)