Hello,
We upgraded ES from 1.4 to 1.5 about 3 weeks ago. Since the upgrade, our master node has been going down about once a week because of a memory problem. The error message is below:
[DEBUG][action.admin.cluster.node.stats] [es_master_02] failed to execute on node [brirsjseReWgd7nSXaE0DQ]
org.elasticsearch.transport.SendRequestTransportException: [es_data_4][inet[ip-10-140-239-168.ec2.internal/10.140.239.168:9300]][cluster:monitor/nodes/stats[n]]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:213)
at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.start(TransportNodesOperationAction.java:165)
at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.access$300(TransportNodesOperationAction.java:97)
at org.elasticsearch.action.support.nodes.TransportNodesOperationAction.doExecute(TransportNodesOperationAction.java:70)
at org.elasticsearch.action.support.nodes.TransportNodesOperationAction.doExecute(TransportNodesOperationAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
at org.elasticsearch.cluster.InternalClusterInfoService$ClusterInfoUpdateJob.run(InternalClusterInfoService.java:260)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
We have 3 master-eligible nodes, and this one is the active master. The same thing has happened on the other master-eligible nodes when they were the active master. When this error occurs, no other node takes over as master, and the cluster state remains red until a restart.
Has anyone seen this kind of error before? Is it an issue with 1.5, or are we doing something wrong?
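In case it helps with diagnosis, heap usage per node could be logged between crashes from the JVM section of the nodes stats API (`GET /_nodes/stats/jvm`). Below is a minimal sketch, not something we run in production. The function name `heap_report`, the script name `heap_report.py`, the 85% threshold, and the `localhost:9200` address are all assumptions made for illustration.

```python
import json
import sys


def heap_report(nodes_stats, threshold_percent=85):
    """Return (node_name, heap_used_percent) pairs at or above the threshold.

    `nodes_stats` is the parsed JSON body of GET /_nodes/stats/jvm.
    The result is sorted with the highest heap usage first.
    """
    rows = []
    for node_id, node in nodes_stats.get("nodes", {}).items():
        pct = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        if pct is None:
            # Skip entries without JVM memory stats.
            continue
        if pct >= threshold_percent:
            # Fall back to the node id if the node name is missing.
            rows.append((node.get("name", node_id), pct))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Reads the nodes stats JSON from standard input.
    for name, pct in heap_report(json.load(sys.stdin)):
        print("%s heap_used_percent=%d" % (name, pct))
```

It could be run periodically as `curl -s 'http://localhost:9200/_nodes/stats/jvm' | python heap_report.py`, so that the heap growth on the active master is visible before the next OutOfMemoryError.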
Thanks,
Umutcan