I have an Elasticsearch cluster running with 8 data nodes, 2 client nodes, and 3 master nodes. I frequently observe that CPU usage on one of the nodes (which one varies, somewhat randomly) rises above 70% and stays at that level until I restart the node.
During this time, response times increase drastically. Everything returns to normal as soon as I restart the node.
I am having trouble troubleshooting this. Could someone please suggest some pointers?
Could there be a routing problem from a client node?
I am running version 2.3.5, and all the data nodes have nearly identical configuration (m4.2xlarge).
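
One way to capture more detail while a node is hot would be to read per-node CPU from the nodes stats API and then pull the hot threads output for whichever node is above the threshold. The following is only a minimal sketch under assumptions: the base URL `http://localhost:9200` is a placeholder, the helper names (`hot_nodes`, `fetch_json`, `fetch_text`, `diagnose`) are hypothetical, and it assumes the stats response exposes CPU as `process.cpu.percent`, which may differ between versions.

```python
import json
import urllib.parse
import urllib.request

CPU_THRESHOLD = 70  # percent; taken from the symptom described above


def hot_nodes(stats, threshold=CPU_THRESHOLD):
    """Return (node_id, node_name, cpu_percent) for nodes above threshold, busiest first.

    `stats` is the parsed JSON body of a nodes stats response. The field path
    process.cpu.percent is an assumption about the response layout.
    """
    found = []
    for node_id, node in stats.get("nodes", {}).items():
        cpu = node.get("process", {}).get("cpu", {}).get("percent")
        if cpu is not None and cpu > threshold:
            found.append((node_id, node.get("name", node_id), cpu))
    return sorted(found, key=lambda item: item[2], reverse=True)


def fetch_json(url):
    """GET a URL and parse the body as JSON."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def fetch_text(url):
    """GET a URL and return the body as text (hot_threads returns plain text)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def diagnose(base_url="http://localhost:9200"):
    """Print the hot threads dump for every node currently above the CPU threshold."""
    stats = fetch_json(base_url + "/_nodes/stats/process")
    for node_id, name, cpu in hot_nodes(stats):
        print("%s (%s): process CPU %s%%" % (name, node_id, cpu))
        node_filter = urllib.parse.quote(node_id, safe="")
        print(fetch_text(base_url + "/_nodes/" + node_filter + "/hot_threads"))
```

Calling `diagnose()` against a client node while the problem is occurring should show which threads (search, bulk, merge, GC, and so on) are consuming CPU on the affected node, which may help tell a routing imbalance apart from something local to that node.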